
Quick and dirty script to convert WordPress export file to Blogger / Atom XML

I’ve created a Python script that converts WordPress export files to Blogger/Atom XML format. Here’s how to use it:

The script takes two command-line arguments:

  • wordpress_export.xml: Path to your WordPress export XML file
  • blogger_export.xml: Path where you want to save the converted Blogger/Atom XML file

To run the script:

python wordpress_to_blogger.py wordpress_export.xml blogger_export.xml

The script performs the following conversions:

  • Converts WordPress posts to Atom feed entries
  • Preserves post titles, content, publication dates, and authors
  • Maintains categories as Atom categories
  • Handles post status (published/draft)
  • Preserves HTML content formatting
  • Converts dates to ISO format required by Atom

The script uses Python’s built-in xml.etree.ElementTree module for XML processing and includes error handling to make it robust.
Some important notes:

  • The script only converts posts (not pages or other content types)
  • It preserves the HTML content of your posts
  • It maintains the original publication dates
  • It handles both published and draft posts
  • The output is a valid Atom XML feed that Blogger can import
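For example, the pubDate values in a WordPress export use the RFC 822 date format, which the script reparses into the ISO 8601 form Atom requires. A minimal standalone sketch (the date value is illustrative):

```python
from datetime import datetime

# WordPress exports dates in RFC 822 format; Atom requires ISO 8601.
pub_date = "Mon, 06 May 2024 10:30:00 +0000"
iso_date = datetime.strptime(pub_date, "%a, %d %b %Y %H:%M:%S %z").isoformat()
print(iso_date)  # 2024-05-06T10:30:00+00:00
```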

The file:

[python]#!/usr/bin/env python3
import xml.etree.ElementTree as ET
import sys
import argparse
from datetime import datetime

NAMESPACES = {
    'wp': 'http://wordpress.org/export/1.2/',
    'content': 'http://purl.org/rss/1.0/modules/content/',
    'dc': 'http://purl.org/dc/elements/1.1/',
}

def convert_wordpress_to_blogger(wordpress_file, output_file):
    # Parse WordPress XML
    tree = ET.parse(wordpress_file)
    root = tree.getroot()

    # Create Atom feed
    atom = ET.Element('feed', {
        'xmlns': 'http://www.w3.org/2005/Atom',
        'xmlns:app': 'http://www.w3.org/2007/app',
        'xmlns:thr': 'http://purl.org/syndication/thread/1.0'
    })

    # Add feed metadata
    title = ET.SubElement(atom, 'title')
    title.text = 'Blog Posts'

    updated = ET.SubElement(atom, 'updated')
    updated.text = datetime.now().isoformat()

    # Process each post
    for item in root.findall('.//item'):
        post_type = item.find('wp:post_type', NAMESPACES)
        if post_type is None or post_type.text != 'post':
            continue

        entry = ET.SubElement(atom, 'entry')

        # Title
        title = ET.SubElement(entry, 'title')
        title.text = item.find('title').text

        # Content
        content = ET.SubElement(entry, 'content', {'type': 'html'})
        content.text = item.find('content:encoded', NAMESPACES).text

        # Publication date (RFC 822 -> ISO 8601)
        pub_date = item.find('pubDate').text
        published = ET.SubElement(entry, 'published')
        published.text = datetime.strptime(pub_date, '%a, %d %b %Y %H:%M:%S %z').isoformat()

        # Author
        author = ET.SubElement(entry, 'author')
        name = ET.SubElement(author, 'name')
        name.text = item.find('dc:creator', NAMESPACES).text

        # Categories
        for category in item.findall('category'):
            ET.SubElement(entry, 'category', {'term': category.text})

        # Status: Blogger uses app:draft to distinguish drafts from published posts
        status = item.find('wp:status', NAMESPACES).text
        app_control = ET.SubElement(entry, 'app:control')
        app_draft = ET.SubElement(app_control, 'app:draft')
        app_draft.text = 'no' if status == 'publish' else 'yes'

    # Write the output file
    tree = ET.ElementTree(atom)
    tree.write(output_file, encoding='utf-8', xml_declaration=True)

def main():
    parser = argparse.ArgumentParser(description='Convert WordPress export to Blogger/Atom XML format')
    parser.add_argument('wordpress_file', help='Path to WordPress export XML file')
    parser.add_argument('output_file', help='Path to output Blogger/Atom XML file')

    args = parser.parse_args()

    try:
        convert_wordpress_to_blogger(args.wordpress_file, args.output_file)
        print(f"Successfully converted {args.wordpress_file} to {args.output_file}")
    except Exception as e:
        print(f"Error: {str(e)}")
        sys.exit(1)

if __name__ == '__main__':
    main()[/python]

Why Project Managers Must Guard Against “Single Points of Failure” in Human Capital

In the world of systems architecture, we’re deeply familiar with the dangers of single points of failure: a server goes down, and suddenly, an entire service collapses. But what about the human side of our operations? What happens when a single employee holds the keys—sometimes literally—to critical infrastructure or institutional knowledge?

As a project manager, you’re not just responsible for timelines and deliverables—you’re also a risk manager. And one of the most insidious risks to any project or company is over-reliance on one individual.


The “Only One Who Knows” Problem

Here are some familiar but risky scenarios:

  • The lead engineer who is the only one with access to production.

  • The architect who built a legacy system but never documented it.

  • The IT admin who’s the sole owner of critical credentials.

  • The contractor who manages deployments but stores scripts only on their local machine.

These situations might feel efficient in the short term—“Let her handle it, she knows it best”—but they are dangerous. Because the moment that person is unavailable (sick leave, resignation, burnout, or worse), your entire project or company is exposed.

This isn’t just about contingency; it’s about resilience.


Human Capital Is Capital

As Peter Drucker famously said, “What gets measured gets managed.” But too often, human capital is not measured or managed with the rigor applied to financial or technical assets.

Yet your people—their knowledge, access, habits—are core infrastructure.

Consider the risks:

  • Operational disruption if a key team member disappears without handover

  • Security vulnerability if credentials are centralized in one individual’s hands

  • Knowledge drain when processes live only in someone’s memory

  • Compliance risk if proper delegation and documentation are missing


Practical Ways to Mitigate the Risk

As a PM or senior tech manager, you can apply several concrete practices to reduce this risk:

1. 📄 Document Everything

  • Maintain centralized and versioned process documentation

  • Include architecture diagrams, deployment workflows, emergency protocols

  • Use internal wikis or documentation tools like Confluence, Notion, or GitBook

2. 👥 Promote Redundancy Through Collaboration

  • Encourage pair programming, shadowing, or “brown bag” sessions

  • Rotate team members through different systems to broaden familiarity

3. 🔄 Rotate Access and Responsibilities

  • Build redundancy into roles—no one should be a bottleneck

  • Use tools like AWS IAM, 1Password, or HashiCorp Vault for shared, audited access

4. 🔎 Test the System Without Them

  • Simulate unavailability scenarios. Can the team deploy without X? Can someone else resolve critical incidents?

  • This is part of operational resiliency planning


A Real-World Example: HSBC’s Core Vacation Policy

When I worked at HSBC, a global financial institution with high security and compliance standards, they enforced a particularly impactful policy:

👉 Every employee or contractor was required to take at least 1 consecutive week of “core vacation” each year.

The reasons were twofold:

  1. Operational Resilience: To ensure that no person was irreplaceable, and teams could function in their absence.

  2. 🚨 Fraud Detection: Continuous presence often masks subtle misuse of systems or privileges. A break allows for behaviors to be reviewed or irregularities to surface.

This policy, common in banking and finance, is a brilliant example of using absence as a testing mechanism—not just for risk, but for trust and transparency.


Building Strong People and Even Stronger Systems

Let’s be clear: this is not about making people “replaceable.”
This is about making systems sustainable and protecting your team from burnout, stress, and unrealistic dependence.

You want to:

  • ✅ Respect your team’s contribution

  • ✅ Protect them from overexposure

  • ✅ Ensure your project or company remains healthy and functional

As the CTO of Basecamp, David Heinemeier Hansson, once said:

“People should be able to take a real vacation without the company collapsing. If they can’t, it’s a leadership failure, not a workforce problem.”



Understanding volatile in Java: A Deep Dive with a Cloud-Native Use Case

In the modern cloud-native world, concurrency is no longer a niche concern. Whether you’re building scalable microservices in Kubernetes, deploying serverless functions in AWS Lambda, or writing multithreaded backend services in Java, thread safety is a concept you must understand deeply.

Among Java’s many concurrency tools, the volatile keyword stands out as both simple and powerful—yet often misunderstood.

This article provides a comprehensive look at volatile, including real-world cloud-based scenarios, a complete Java example, and important caveats every developer should know.

What Does volatile Mean in Java?

At its core, the volatile keyword in Java is used to ensure visibility of changes to variables across threads.

  • Guarantees read/write operations are done directly from and to main memory, avoiding local CPU/thread caches.
  • Ensures a “happens-before” relationship, meaning changes to a volatile variable by one thread are visible to all other threads that read it afterward.

❌ The Problem volatile Solves

Let’s consider the classic issue: Thread A updates a variable, but Thread B doesn’t see it due to caching.

public class ServerStatus {
    private static boolean isRunning = true;

    public static void main(String[] args) throws InterruptedException {
        Thread monitor = new Thread(() -> {
            while (isRunning) {
                // still running...
            }
            System.out.println("Service stopped.");
        });

        monitor.start();
        Thread.sleep(1000);
        isRunning = false;
    }
}

Under certain JVM optimizations, the monitor thread might never see the change, causing an infinite loop.

✅ Using volatile to Fix the Visibility Issue

public class ServerStatus {
    private static volatile boolean isRunning = true;

    public static void main(String[] args) throws InterruptedException {
        Thread monitor = new Thread(() -> {
            while (isRunning) {
                // monitor
            }
            System.out.println("Service stopped.");
        });

        monitor.start();
        Thread.sleep(1000);
        isRunning = false;
    }
}

This change ensures all threads read the latest value of isRunning from main memory.

☁️ Cloud-Native Use Case: Gracefully Stopping a Health Check Monitor

Now let’s ground this with a real-world cloud-native example. Suppose a Spring Boot microservice runs a background thread that polls the health of cloud instances (e.g., EC2 or GCP VMs). On shutdown—triggered by a Kubernetes preStop hook—you want the monitor to exit cleanly.

public class CloudHealthMonitor {

    private static volatile boolean running = true;

    public static void main(String[] args) {
        Thread healthThread = new Thread(() -> {
            while (running) {
                pollHealthCheck();
                sleep(5000);
            }
            System.out.println("Health monitoring terminated.");
        });

        healthThread.start();

        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("Shutdown signal received.");
            running = false;
        }));
    }

    private static void pollHealthCheck() {
        System.out.println("Checking instance health...");
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ignored) {}
    }
}

This approach ensures your application exits gracefully, cleans up properly, and avoids unnecessary errors or alerts in monitoring systems.

⚙️ How volatile Works Behind the Scenes

Java allows compilers and processors to reorder instructions for optimization. This can lead to unexpected results in multithreaded contexts.

volatile introduces memory barriers that prevent instruction reordering and force flushes to/from main memory, maintaining predictable behavior.

Common Misconceptions

  • “volatile makes everything thread-safe” ❌ False. It provides visibility, not atomicity.
  • “Use volatile instead of synchronized” ❌ Only for simple flags; use synchronized for compound logic.
  • “volatile is faster than synchronized” ✅ Often true, but only if used appropriately.

When Should You Use volatile?

✔ Use it for:

  • Flags like running, shutdownRequested
  • Read-mostly config values that are occasionally changed
  • Safe publication in single-writer, multi-reader setups

✘ Avoid for:

  • Atomic counters (use AtomicInteger)
  • Complex inter-thread coordination
  • Compound read-modify-write operations

✅ Summary Table

Feature                 volatile
Visibility guarantee    ✅ Yes
Atomicity guarantee     ❌ No
Lock-free               ✅ Yes
Use for flags           ✅ Yes
Use for counters        ❌ No
Cloud relevance         ✅ Graceful shutdowns, health checks

Conclusion

In today’s cloud-native Java ecosystem, understanding concurrency is essential. The volatile keyword—though simple—offers a reliable way to ensure thread visibility and safe signaling across threads.

Whether you’re stopping a background process, toggling a configuration flag, or signaling graceful shutdowns, volatile remains an invaluable tool for writing correct, responsive, and cloud-ready code.

What About You?

Have you used volatile in a critical system before? Faced tricky visibility bugs? Share your insights in the comments!


Advanced Encoding in Java, Kotlin, Node.js, and Python

Encoding is essential for handling text, binary data, and secure transmission across applications. Understanding advanced encoding techniques can help prevent data corruption and ensure smooth interoperability across systems. This post explores key encoding challenges and how Java/Kotlin, Node.js, and Python tackle them.


1️⃣ Handling Special Unicode Characters (Emoji, Accents, RTL Text)

Java/Kotlin

Java uses UTF-16 internally, but for external data (JSON, databases, APIs), explicit encoding is required:

String text = "🔧 Café مرحبا";
byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
System.out.println(decoded); // 🔧 Café مرحبا

Tip: Always specify StandardCharsets.UTF_8 to avoid platform-dependent defaults.

Node.js

const text = "🔧 Café مرحبا";
const utf8Buffer = Buffer.from(text, 'utf8');
const decoded = utf8Buffer.toString('utf8');
console.log(decoded); // 🔧 Café مرحبا

Tip: Using an incorrect encoding (e.g., latin1) may corrupt characters.

Python

text = "🔧 Café مرحبا"
utf8_bytes = text.encode("utf-8")
decoded = utf8_bytes.decode("utf-8")
print(decoded)  # 🔧 Café مرحبا

Tip: Python 3 handles Unicode by default, but explicit encoding is always recommended.
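As a quick sanity check on what UTF-8 actually does with such strings, note that the character count and the byte count diverge: an emoji takes four bytes, an accented Latin letter two. A minimal sketch (the sample string is illustrative):

```python
text = "🔧 Café"
data = text.encode("utf-8")
# 6 characters, but 10 bytes: the emoji needs 4 bytes, the é needs 2.
print(len(text), len(data))  # 6 10
assert data.decode("utf-8") == text
```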


2️⃣ Encoding Binary Data for Transmission (Base64, Hex, Binary Files)

Java/Kotlin

byte[] data = "Hello World".getBytes(StandardCharsets.UTF_8);
String base64Encoded = Base64.getEncoder().encodeToString(data);
byte[] decoded = Base64.getDecoder().decode(base64Encoded);
System.out.println(new String(decoded, StandardCharsets.UTF_8)); // Hello World

Node.js

const data = Buffer.from("Hello World", 'utf8');
const base64Encoded = data.toString('base64');
const decoded = Buffer.from(base64Encoded, 'base64').toString('utf8');
console.log(decoded); // Hello World

Python

import base64
data = "Hello World".encode("utf-8")
base64_encoded = base64.b64encode(data).decode("utf-8")
decoded = base64.b64decode(base64_encoded).decode("utf-8")
print(decoded)  # Hello World

Tip: Base64 encoding increases data size (~33% overhead), which can be a concern for large files.
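That overhead is easy to verify: Base64 emits 4 output bytes for every 3 input bytes, so a payload grows by roughly a third. A quick standalone check:

```python
import base64

raw = b"x" * 3000
encoded = base64.b64encode(raw)
# 4 output bytes per 3 input bytes: 3000 -> 4000 (~33% overhead).
print(len(raw), len(encoded))  # 3000 4000
```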


3️⃣ Charset Mismatches and Cross-Language Encoding Issues

A file encoded in ISO-8859-1 (Latin-1) may cause garbled text when read using UTF-8.

Java/Kotlin Solution:

byte[] bytes = Files.readAllBytes(Paths.get("file.txt"));
String text = new String(bytes, StandardCharsets.ISO_8859_1);

Node.js Solution:

const fs = require('fs');
const text = fs.readFileSync("file.txt", { encoding: "latin1" });

Python Solution:

with open("file.txt", "r", encoding="ISO-8859-1") as f:
    text = f.read()

Tip: Always specify encoding explicitly when working with external files.
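To see why the mismatch matters, here is a small Python sketch of the classic failure mode: UTF-8 bytes decoded as Latin-1 do not raise an error, they silently produce mojibake:

```python
# "Café" written as UTF-8, then wrongly read back as Latin-1.
data = "Café".encode("utf-8")
wrong = data.decode("ISO-8859-1")
print(wrong)  # CafÃ© -- silently garbled, no exception raised
right = data.decode("utf-8")
print(right)  # Café
```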


4️⃣ URL Encoding and Decoding

Java/Kotlin

String encoded = URLEncoder.encode("Hello World!", StandardCharsets.UTF_8);
String decoded = URLDecoder.decode(encoded, StandardCharsets.UTF_8);

Node.js

const encoded = encodeURIComponent("Hello World!");
const decoded = decodeURIComponent(encoded);

Python

from urllib.parse import quote, unquote
encoded = quote("Hello World!")
decoded = unquote(encoded)

Tip: Use UTF-8 for URL encoding to prevent inconsistencies across different platforms.
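Worth noting: in Python 3, quote already percent-encodes non-ASCII characters as UTF-8 bytes by default, which is exactly the behavior the tip recommends. A quick illustration:

```python
from urllib.parse import quote, unquote

# Non-ASCII characters are percent-encoded as their UTF-8 bytes.
encoded = quote("Café")
print(encoded)  # Caf%C3%A9
assert unquote(encoded) == "Café"
```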


Conclusion: Choosing the Right Approach

  • Java/Kotlin: Strong type safety, but requires careful Charset management.
  • Node.js: Web-friendly but depends heavily on Buffer conversions.
  • Python: Simple and concise, though strict type conversions must be managed.

📌 Pro Tip: Always be explicit about encoding when handling external data (APIs, files, databases) to avoid corruption.

 

Mastering DNS Configuration: A, AAAA, CNAME, and Best Practices with OVH

I am currently reorganizing one of my websites, hosted at OVHcloud, so it is worth revisiting some concepts and best practices related to DNS.

(Disclaimer: I am not affiliated with OVH in any way; I write purely as a customer.)

DNS (Domain Name System) is the backbone of the internet, translating human-friendly domain names into IP addresses that computers understand. Yet, many website owners and IT professionals struggle with its configuration. Let’s break down the essential DNS records—A, AAAA, and CNAME—and illustrate best practices using OVH’s interface.

Key DNS Records Explained

1️⃣ A Record (Address Record)

  • Maps a domain (e.g., example.com) to an IPv4 address (e.g., 192.168.1.1).
  • Best practice: Ensure you update this if your server IP changes.

2️⃣ AAAA Record (IPv6 Address Record)

  • Similar to A records but maps to an IPv6 address (e.g., 2001:db8::1).
  • Best practice: If your hosting provider supports IPv6, use this alongside A records for better future-proofing.

3️⃣ CNAME Record (Canonical Name Record)

  • Points a domain (e.g., blog.example.com) to another domain (example.wordpress.com).
  • Best practice: Use CNAME for aliases but avoid pointing the root domain (example.com) to another domain using CNAME—stick to A/AAAA records.

Configuring DNS Records in OVH

To set up a subdomain (blog.example.com) on OVH:

  1. Log in to your OVH Control Panel.
  2. Navigate to Web Cloud → Domains, then select your domain.
  3. Go to the DNS Zone tab and click Add an entry.
  4. Choose A Record if your blog has a dedicated IPv4, or CNAME if pointing to another domain.
  5. Enter your subdomain (blog) and the corresponding IP or domain.
  6. Save changes and wait for propagation (~24 hours max).

Best Practices for DNS Management

  • Use TTL (Time-To-Live) wisely: Lower values (e.g., 300s) allow faster updates but increase queries to your DNS provider.
  • Keep DNS records minimal: Avoid unnecessary CNAME chains to improve resolution speed.
  • Secure with DNSSEC: If your registrar supports it, enable DNSSEC to prevent DNS spoofing.
  • Regularly review DNS settings: Especially after migrations, new SSL configurations, or changes in hosting.

Efficient Inter-Service Communication with Feign and Spring Cloud in Multi-Instance Microservices

In a world where systems are becoming increasingly distributed and cloud-native, microservices have emerged as the de facto architecture. But as we scale
microservices horizontally—running multiple instances for each service—one of the biggest challenges becomes inter-service communication.

How do we ensure that our services talk to each other reliably, efficiently, and in a way that’s resilient to failures?

Welcome to the world of Feign and Spring Cloud.


The Challenge: Multi-Instance Microservices

Imagine you have a user-service that needs to talk to an order-service, and your order-service runs 5 instances behind a
service registry like Eureka. Hardcoding URLs? That’s brittle. Manual load balancing? Not scalable.

You need:

  • Service discovery to dynamically resolve where to send the request
  • Load balancing across instances
  • Resilience for timeouts, retries, and fallbacks
  • Clean, maintainable code that developers love

The Solution: Feign + Spring Cloud

OpenFeign is a declarative web client. Think of it as a smart HTTP client where you only define interfaces — no more boilerplate REST calls.

When combined with Spring Cloud, Feign becomes a first-class citizen in a dynamic, scalable microservices ecosystem.

✅ Features at a Glance:

  • Declarative REST client
  • Automatic service discovery (Eureka, Consul)
  • Client-side load balancing (Spring Cloud LoadBalancer)
  • Integration with Resilience4j for circuit breaking
  • Easy integration with Spring Boot config and observability tools

Step-by-Step Setup

1. Add Dependencies

[xml]
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-openfeign</artifactId>
</dependency>
[/xml]

If using Eureka:

[xml]
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
[/xml]


2. Enable Feign Clients

In your main Spring Boot application class:

[java]
@SpringBootApplication
@EnableFeignClients
public class UserServiceApplication { … }
[/java]


3. Define Your Feign Interface

[java]
@FeignClient(name = "order-service")
public interface OrderClient {

    @GetMapping("/orders/{id}")
    OrderDTO getOrder(@PathVariable("id") Long id);
}
[/java]

Spring will automatically:

  • Register this as a bean
  • Resolve order-service from Eureka
  • Load-balance across all its instances

4. Add Resilience with Fallbacks

You can configure a fallback to handle failures gracefully:

[java]
@FeignClient(name = "order-service", fallback = OrderClientFallback.class)
public interface OrderClient {

    @GetMapping("/orders/{id}")
    OrderDTO getOrder(@PathVariable Long id);
}
[/java]

The fallback:

[java]
@Component
public class OrderClientFallback implements OrderClient {

    @Override
    public OrderDTO getOrder(Long id) {
        return new OrderDTO(id, "Fallback Order", LocalDate.now());
    }
}
[/java]


⚙️ Configuration Tweaks

Customize Feign timeouts in application.yml:

[yml]
feign:
  client:
    config:
      default:
        connectTimeout: 3000
        readTimeout: 500
[/yml]

Enable retry:

[yml]
feign:
  client:
    config:
      default:
        retryer:
          maxAttempts: 3
          period: 1000
          maxPeriod: 2000
[/yml]


What Happens Behind the Scenes?

When user-service calls order-service:

  1. Spring Cloud uses Eureka to resolve all instances of order-service.
  2. Spring Cloud LoadBalancer picks an instance using round-robin (or your chosen strategy).
  3. Feign sends the HTTP request to that instance.
  4. If it fails, Resilience4j (or your fallback) handles it gracefully.

Observability & Debugging

Use Spring Boot Actuator to expose Feign metrics:

[xml]
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
[/xml]

And tools like Spring Cloud Sleuth + Zipkin for distributed tracing across Feign calls.


Beyond the Basics

To go even further:

  • Integrate with Spring Cloud Gateway for API routing and external access.
  • Use Spring Cloud Config Server to centralize configuration across environments.
  • Secure Feign calls with OAuth2 via Spring Security and OpenID Connect.

✨ Final Thoughts

Using Feign with Spring Cloud transforms service-to-service communication from a tedious, error-prone task into a clean, scalable, and cloud-native solution.
Whether you’re scaling services across zones or deploying in Kubernetes, Feign ensures your services communicate intelligently and resiliently.

Problem: Spring JMS MessageListener Stuck / Not Receiving Messages

Scenario

A Spring Boot application using ActiveMQ with @JmsListener suddenly stops receiving messages after running for a while. No errors in logs, and the queue keeps growing, but the consumers seem idle.

Setup

@JmsListener(destination = "myQueue", concurrency = "5-10")
public void processMessage(String message) {
    log.info("Received: {}", message);
}
  • ActiveMQConnectionFactory was used.

  • The queue (myQueue) was filling up.

  • Restarting the app temporarily fixed the issue.


Investigation

  1. Checked ActiveMQ Monitoring (Web Console)

    • Messages were enqueued but not dequeued.

    • Consumers were still active, but not processing.

  2. Thread Dump Analysis

    • Found that listener threads were stuck in a waiting state.

    • The problem only occurred under high load.

  3. Checked JMS Acknowledgment Mode

    • Default AUTO_ACKNOWLEDGE was used.

    • Suspected an issue with message acknowledgment.

  4. Enabled Debug Logging

    • Added:

      logging.level.org.springframework.jms=DEBUG
    • Found repeated logs like:

      JmsListenerEndpointContainer#0-1 received message, but no further processing
    • This hinted at connection issues.

  5. Tested with a Different Message Broker

    • Using Artemis JMS instead of ActiveMQ resolved the issue.

    • Indicated that it was broker-specific.


Root Cause

ActiveMQ’s TCP connection was silently dropped, but the JMS client did not detect it.

  • When the connection is lost, DefaultMessageListenerContainer doesn’t always recover properly.

  • ActiveMQ does not always notify clients of broken connections.

  • No exceptions were thrown because the connection was technically “alive” but non-functional.


Fix

  1. Enabled keepAlive in the ActiveMQ connection factory

    ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory();
    factory.setUseKeepAlive(true);
    factory.setOptimizeAcknowledge(true);
    return factory;
  2. Forced Reconnection with Exception Listener

    • Implemented:

      factory.setExceptionListener(exception -> {
          log.error("JMS Exception occurred, reconnecting...", exception);
          restartJmsListener();
      });
    • This ensured that if a connection was dropped, the listener restarted.

  3. Switched to DefaultJmsListenerContainerFactory with DMLC

    • SimpleMessageListenerContainer was less reliable in handling reconnections.

    • New Configuration:

      @Bean
      public DefaultJmsListenerContainerFactory jmsListenerContainerFactory(ConnectionFactory connectionFactory) {
          DefaultJmsListenerContainerFactory factory = new DefaultJmsListenerContainerFactory();
          factory.setConnectionFactory(connectionFactory);
          factory.setSessionTransacted(true);
          factory.setErrorHandler(t -> log.error("JMS Listener error", t));
          return factory;
      }

Final Outcome

✅ After applying these fixes, the issue never reoccurred.
🚀 The app remained stable even under high load.


Key Takeaways

  • Silent disconnections in ActiveMQ can cause message listeners to hang.

  • Enable keepAlive and optimizeAcknowledge for reliable connections.

  • Use DefaultJmsListenerContainerFactory with DMLC instead of SMLC.

  • Implement an ExceptionListener to restart the JMS connection if necessary.

 

How to Bypass Elasticsearch’s 10,000-Result Limit with the Scroll API

If you’ve ever worked with the Elasticsearch API, you’ve likely run into its infamous 10,000-result limit. It’s a default cap that can feel like a brick wall when you’re dealing with large datasets—think log analysis, report generation, or bulk data exports. Fortunately, there’s a slick workaround: the Scroll API. In this post, I’ll walk you through why this limit exists, how the Scroll API solves it, and share practical examples to get you started.

Why the 10,000-Result Limit Exists

Elasticsearch caps standard search results at 10,000 to protect performance. Fetching millions of records in one shot with from and size parameters can strain memory and slow things down. But what if you need all that data? That’s where the Scroll API shines—it’s designed for deep pagination, letting you retrieve everything in manageable chunks.

What Is the Scroll API?

Unlike a typical search, the Scroll API maintains a temporary “scroll context” on the server. You grab a batch of results, get a scroll_id, and use it to fetch the next batch—no need to rerun your query. It’s efficient, scalable, and perfect for big data tasks.

How to Use the Scroll API: Step by Step

Let’s break it down with examples you can try yourself.

Step 1: Start the Scroll

Kick things off with a search request. Add the scroll parameter (like 1m for a 1-minute timeout) and set size to control your batch size. Here’s a basic example:
GET /my_index/_search?scroll=1m
{
  "size": 1000,
  "query": {
    "match_all": {}
  }
}
This pulls the first 1,000 hits and returns a `scroll_id`—a long, encoded string you’ll need for the next step.

Step 2: Fetch More Results

Using that `scroll_id`, request the next batch. You don’t need to repeat the query—just send the ID and timeout:
POST /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "c2NhbjsxMDAwO...YOUR_SCROLL_ID_HERE..."
}
Loop this call until you’ve retrieved all your data. Each response includes a new `scroll_id` (sometimes the same, depending on the version), so keep updating it.

Step 3: Clean Up

When you’re done, delete the scroll context to free up server resources. It’s a small but critical step:
DELETE /_search/scroll/c2NhbjsxMDAwO...YOUR_SCROLL_ID_HERE...

Skip this, and you’ll leave dangling contexts that could bog down your cluster.

A Real-World Example

Let’s say you’re sifting through millions of logs for a specific error. Here’s a targeted scroll query:
GET /logs/_search?scroll=2m
{
  "size": 500,
  "query": {
    "match": {
      "error_message": "timeout"
    }
  }
}

Then, use the Scroll API to paginate through every matching log entry. It’s way cleaner than hacking around with `from` and `size`.
Tips for Scroll API Success
  • Batch Size: Stick to a `size` like 500–1000. Too large, and you’ll strain memory; too small, and you’ll make too many requests.
  • Timeout Tuning: Set the scroll duration (e.g., `1m`, `5m`) based on how fast your script processes each batch. Too short, and the context expires mid-run.
  • Automation: Use a script to handle the loop. Python’s `elasticsearch` library, for instance, has a handy scroll helper:
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])
scroll = es.search(index="logs", scroll="2m", size=500, body={"query": {"match": {"error_message": "timeout"}}})
scroll_id = scroll["_scroll_id"]

while len(scroll["hits"]["hits"]):
    print(scroll["hits"]["hits"])  # Process this batch
    scroll = es.scroll(scroll_id=scroll_id, scroll="2m")
    scroll_id = scroll["_scroll_id"]

es.clear_scroll(scroll_id=scroll_id)  # Cleanup

Why Scroll Beats the Alternatives

You could tweak `index.max_result_window` to raise the limit, but that’s a performance gamble. Export tools or aggregations might work for summaries, but for raw data retrieval, Scroll is king—efficient and built for the job.

Conclusion

The Scroll API has been a game-changer for my Elasticsearch projects, especially when wrestling with massive indices. It’s simple once you get the hang of it, and the payoff is huge.

Elastic APM: When to Use @CaptureSpan vs. @CaptureTransaction?

If you’re working with Elastic APM in a Java application, you might wonder when to use `@CaptureSpan` versus `@CaptureTransaction`. Both are powerful tools for observability, but they serve different purposes.
🔹 `@CaptureTransaction`:
Use this at the entry point of a request, typically at a controller, service method, or a background job. It defines the start of a transaction and allows you to trace how a request propagates through your system.
🔹 `@CaptureSpan`:
Use this to track sub-operations within a transaction, such as database queries, HTTP calls, or specific business logic. It helps break down execution time and pinpoint performance bottlenecks inside a transaction.

📌 Best Practices:

✅ Apply @CaptureTransaction at the highest-level method handling a request.
✅ Use @CaptureSpan for key internal operations you want to monitor.
✅ Avoid excessive spans—instrument only critical code paths to reduce overhead.

By balancing these annotations effectively, you can get detailed insights into your app’s performance while keeping APM overhead minimal.

 

Java’s Emerging Role in AI and Machine Learning: Bridging the Gap to Production

While Python dominates in model training, Java is becoming increasingly vital for deploying and serving AI/ML models in production. Its performance, stability, and enterprise integration capabilities make it a strong contender.

Java Example: Real-time Object Detection with DL4J and OpenCV

[java]
import …

public class ObjectDetection {

public static void main(String[] args) {
String modelPath = "yolov3.weights";
String configPath = "yolov3.cfg";
String imagePath = "image.jpg";
Net net = Dnn.readNet(modelPath, configPath);
Mat image = imread(imagePath);
Mat blob = Dnn.blobFromImage(image, 1 / 255.0, new Size(416, 416), new Scalar(0, 0, 0), true, false);

net.setInput(blob);

MatVector detections = net.forward(); // Inference

// Process detections (bounding boxes, classes, confidence)
// … (complex logic for object detection results)
// Draw bounding boxes on the image
// … (OpenCV drawing functions)
imwrite("detected_objects.jpg", image);
}
}

[/java]

Python Example: Similar Object Detection with OpenCV and YOLO

[python]

import cv2
import numpy as np

net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")
image = cv2.imread("image.jpg")
blob = cv2.dnn.blobFromImage(image, 1/255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
detections = net.forward()

# Process detections (bounding boxes, classes, confidence)
# … (simpler logic, NumPy arrays)
# Draw bounding boxes on the image
# … (OpenCV drawing functions)
cv2.imwrite("detected_objects.jpg", image)
[/python]

Comparison and Insights:

  • Syntax and Readability: Python’s syntax is generally more concise and readable for data science and AI tasks. Java, while more verbose, offers strong typing and better performance for production deployments.
  • Library Ecosystem: Python’s ecosystem (NumPy, OpenCV, TensorFlow, PyTorch) is more mature and developer-friendly for AI/ML development. Java, with libraries like DL4J, is catching up, but its strength lies in enterprise integration and performance.
  • Performance: Java’s performance is often superior to Python’s, especially for real-time inference and high-throughput applications.
  • Enterprise Integration: Java’s ability to seamlessly integrate with existing enterprise systems (databases, message queues, APIs) is a significant advantage.
  • Deployment: Java’s deployment capabilities are more robust, making it suitable for mission-critical AI applications.

Key Takeaways:

  • Python is excellent for rapid prototyping and model training.
  • Java excels in deploying and serving AI/ML models in production environments, where performance and reliability are paramount.
  • The choice between Java and Python depends on the specific use case and requirements.