Observability for Modern Systems: From Metrics to Traces

Author: Jonathan Lalou | September 20, 2025

Good monitoring doesn’t just tell you when things are broken—it explains why.

1) White-Box vs Black-Box Monitoring

White-box: metrics emitted from inside the system (CPU, memory, application metrics). Example: http_server_requests_seconds, the request timer exposed by Spring Boot Actuator.
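To make the white-box idea concrete, here is a minimal sketch of the kind of data a request timer collects from inside the process: a call count and a running total of time spent. RequestTimer and its methods are hypothetical names used for illustration; this is not Micrometer's or Actuator's API, which handle this for you in a Spring Boot application.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical in-process timer: a stand-in for what a metrics library records. */
final class RequestTimer {
    private final LongAdder count = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    /** Runs the handler and records its duration, even if the handler throws. */
    <T> T record(Supplier<T> handler) {
        long start = System.nanoTime();
        try {
            return handler.get();
        } finally {
            count.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }

    /** Number of handler invocations recorded so far. */
    long count() {
        return count.sum();
    }

    /** Total time spent in handlers, in seconds. */
    double totalSeconds() {
        return totalNanos.sum() / 1_000_000_000.0;
    }

    /** Mean latency in seconds, or 0 when nothing has been recorded. */
    double meanSeconds() {
        long n = count.sum();
        return n == 0 ? 0.0 : totalSeconds() / n;
    }
}
```

Wrapping a handler as timer.record(() -> handleRequest(req)) (handleRequest being whatever your code already does) gives you the count and total time; a real metrics library adds tags, histogram buckets, and an export format on top.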

Black-box: synthetic probes that simulate user behavior from outside the system (API pings, load-test flows). Example: a periodic “buy flow” test in production.
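A black-box probe can be as small as one HTTP request plus a pass/fail rule. The sketch below uses only the JDK's java.net.http client; SyntheticProbe, Result, and the latency budget are hypothetical names and choices made for illustration, and a real "buy flow" probe would chain several such calls (add to cart, pay, confirm).

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class SyntheticProbe {
    /** Outcome of one probe run; status is -1 when no HTTP response was received. */
    record Result(int status, Duration elapsed) {}

    /** A probe passes only if it got a 2xx response within the latency budget. */
    static boolean isHealthy(Result r, Duration budget) {
        return r.status() >= 200 && r.status() < 300
                && r.elapsed().compareTo(budget) <= 0;
    }

    /** Sends one GET from outside the system, the way a user's client would. */
    static Result probe(HttpClient client, URI target, Duration timeout) {
        HttpRequest request = HttpRequest.newBuilder(target).timeout(timeout).GET().build();
        long start = System.nanoTime();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return new Result(response.statusCode(), Duration.ofNanos(System.nanoTime() - start));
        } catch (IOException e) {
            // Connection failures and timeouts count as "no response"
            return new Result(-1, Duration.ofNanos(System.nanoTime() - start));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new Result(-1, Duration.ofNanos(System.nanoTime() - start));
        }
    }
}
```

Run probe(...) on a schedule (for example with a ScheduledExecutorService) against an endpoint of your choosing, and alert when isHealthy returns false several times in a row rather than on a single failure.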

2) Tracing Distributed Transactions

Use OpenTelemetry to propagate context across microservices:

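// build.gradle (Spring Boot project): OTLP exporter dependency
implementation "io.opentelemetry:opentelemetry-exporter-otlp:1.30.0"

// Java: wrap the checkout flow in a manually created span.
// "tracer" is an io.opentelemetry.api.trace.Tracer obtained elsewhere.
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.context.Scope;

Span span = tracer.spanBuilder("checkout").startSpan();
try (Scope scope = span.makeCurrent()) {
    // Spans started on this thread while the scope is open become children of "checkout"
    paymentService.charge(card);
    inventoryService.reserve(item);
} catch (RuntimeException e) {
    // Mark the span as failed so the error is visible in the trace
    span.recordException(e);
    span.setStatus(StatusCode.ERROR);
    throw e;
} finally {
    span.end();
}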

These traces flow into Jaeger or Grafana Tempo to visualize bottlenecks across services.
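What "propagating context" means on the wire: by default OpenTelemetry uses the W3C Trace Context format, where each outgoing request carries a traceparent header holding the trace id, the caller's span id, and a flags byte. Instrumented HTTP clients and servers handle this for you; the sketch below only illustrates the header format with a hypothetical TraceParent type, and is not part of the OpenTelemetry API.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parsed W3C traceparent header (this sketch accepts version 00 only). */
record TraceParent(String traceId, String spanId, boolean sampled) {

    // version "00" - 32 hex chars of trace id - 16 hex chars of span id - 2 hex chars of flags
    private static final Pattern FORMAT =
            Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    /** Reads the header a caller sent; empty if it is missing or malformed. */
    static Optional<TraceParent> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        // All-zero trace or span ids are invalid per the specification
        if (!m.matches() || m.group(1).matches("0+") || m.group(2).matches("0+")) {
            return Optional.empty();
        }
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) != 0;
        return Optional.of(new TraceParent(m.group(1), m.group(2), sampled));
    }

    /** Same trace, new span id: what a service puts on its own outgoing calls. */
    TraceParent child(String newSpanId) {
        return new TraceParent(traceId, newSpanId, sampled);
    }

    /** Serializes back to header form; only the sampled flag bit is preserved. */
    String toHeader() {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }
}
```

Because every service keeps the trace id and swaps in its own span id, the backend can stitch the spans from all services into a single trace.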

3) Example Dashboard for a High-Value Service

Availability: % successful requests (SLO vs actual).
Latency: p95/p99 end-to-end response times.
Error Rate: 4xx vs 5xx breakdown.
Dependency Health: DB latency, cache hit ratio, downstream service SLOs.
User Metrics: active sessions, checkout success rate.
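The first two panels come down to simple arithmetic. As a rough sketch, assuming you already have raw request counts and latency samples at hand (in practice the metrics backend computes these from histograms), availability, error-budget usage, and nearest-rank percentiles can be derived as follows. SloMath and its method names are hypothetical.

```java
import java.util.Arrays;

final class SloMath {
    /** Fraction of successful requests, e.g. 0.9995 for 99.95%. */
    static double availability(long total, long failed) {
        return total == 0 ? 1.0 : (total - failed) / (double) total;
    }

    /** Share of the error budget consumed; 1.0 means the budget is fully spent. */
    static double errorBudgetConsumed(double sloTarget, long total, long failed) {
        double allowedFailures = (1.0 - sloTarget) * total;
        if (allowedFailures == 0) {
            return failed == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return failed / allowedFailures;
    }

    /** Nearest-rank percentile: the smallest sample that at least p% of samples do not exceed. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

For example, 5 failures out of 10,000 requests against a 99.9% SLO gives 99.95% availability with half of the error budget used. Showing the budget next to the raw availability makes the "SLO vs actual" comparison readable at a glance.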

Posted in en-US | Tags: Monitoring, OpenTelemetry, SRE
