Jonathan Lalou's Blog

Building Resilient Architectures: Patterns That Survive Failure

Author: Jonathan Lalou | October 20, 2025

How to design systems that gracefully degrade, recover quickly, and scale under pressure.

1) Patterns for Graceful Degradation

When dependencies fail, your system should still provide partial service. Examples:

Show cached product data if the pricing service is down.
Allow “read-only” mode if writes are failing.
Provide degraded image quality if the CDN is unavailable.
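
The first case (cached data when a dependency is down) can be sketched in plain Java. This is a minimal illustration, not a production cache: `CachedFallback` and the lambda standing in for the pricing call are hypothetical names, there is no expiry, and the live call is assumed never to return null.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical wrapper: serves the last known-good value when the live call fails.
final class CachedFallback<K, V> {
    private final Map<K, V> lastKnownGood = new ConcurrentHashMap<>();
    private final Function<K, V> liveCall;

    CachedFallback(Function<K, V> liveCall) {
        this.liveCall = liveCall;
    }

    // Fresh value when the dependency answers, stale value when it fails,
    // empty when it fails and nothing has been cached for this key yet.
    Optional<V> get(K key) {
        try {
            V fresh = liveCall.apply(key); // assumed non-null
            lastKnownGood.put(key, fresh);
            return Optional.of(fresh);
        } catch (RuntimeException e) {
            return Optional.ofNullable(lastKnownGood.get(key));
        }
    }

    public static void main(String[] args) {
        boolean[] pricingDown = {false};
        CachedFallback<String, Integer> prices = new CachedFallback<String, Integer>(sku -> {
            if (pricingDown[0]) throw new IllegalStateException("pricing service down");
            return 1999; // price in cents
        });
        System.out.println(prices.get("sku-1")); // live value
        pricingDown[0] = true;
        System.out.println(prices.get("sku-1")); // stale value served from the cache
    }
}
```

Returning an empty `Optional` for a never-seen key forces the caller to decide what "partial service" means there, for example hiding the price instead of showing a wrong one.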

2) Circuit Breakers

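Prevent cascading failures with a circuit breaker library such as Resilience4j (Hystrix is in maintenance mode and best avoided for new projects). With Resilience4j's Spring annotation: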

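```java
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;

// When the breaker is open, calls skip the remote service and go to the fallback.
@CircuitBreaker(name = "inventoryService", fallbackMethod = "fallbackInventory")
public Inventory getInventory(String productId) {
    // URI template variable so productId is encoded safely; the relative path
    // assumes restTemplate was built with a root URI.
    return restTemplate.getForObject("/inventory/{productId}", Inventory.class, productId);
}

// Fallback: same parameters plus the Throwable; reports zero stock instead of failing.
public Inventory fallbackInventory(String productId, Throwable t) {
    return new Inventory(productId, 0);
}
```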
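
To see what the annotation hides, here is a hand-rolled sketch of the closed / open / half-open state machine. It is deliberately simplified and is not how Resilience4j is implemented: `SimpleCircuitBreaker` is a hypothetical class, it counts consecutive failures rather than a sliding-window failure rate, and it is not thread-safe.

```java
import java.util.function.Supplier;

// Hypothetical, single-threaded sketch of a circuit breaker state machine.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures before opening
    private final long openMillis;      // how long to stay open before a trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    State state() {
        return state;
    }

    // nowMillis is passed in so the behaviour can be exercised without sleeping.
    <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < openMillis) {
                return fallback.get();   // fail fast: the dependency is not called
            }
            state = State.HALF_OPEN;     // wait elapsed: allow one trial call
        }
        try {
            T result = action.get();
            state = State.CLOSED;        // success closes the breaker again
            consecutiveFailures = 0;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;      // trip (or re-trip) the breaker
                openedAt = nowMillis;
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 1_000);
        Supplier<String> failing = () -> { throw new IllegalStateException("inventory down"); };
        for (int i = 0; i < 3; i++) {
            String result = breaker.call(failing, () -> "fallback", System.currentTimeMillis());
            System.out.println(result + " / " + breaker.state());
        }
    }
}
```

The key property is the fail-fast branch: once open, the breaker stops sending traffic to the struggling dependency, which is what prevents the cascade.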

3) Retries with Backoff

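Retries should be bounded and spaced out, and applied only to operations that are safe to repeat. A payment call like this one needs an idempotency key so that a retry cannot charge twice: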

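```java
// Retries processPayment on failure; once attempts are exhausted, Resilience4j
// calls fallbackPayment(PaymentRequest, Throwable), which must exist in this class.
@Retry(name = "paymentService", fallbackMethod = "fallbackPayment")
public PaymentResponse processPayment(PaymentRequest req) {
    return restTemplate.postForObject("/pay", req, PaymentResponse.class);
}
```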

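```java
RetryConfig config = RetryConfig.custom()
    .maxAttempts(3) // total attempts, including the first call
    // Exponential backoff with jitter: 200 ms base, doubling, randomized by +/- 50%.
    // waitDuration is omitted: the interval function already sets the delay,
    // and combining the two is redundant at best.
    .intervalFunction(IntervalFunction.ofExponentialRandomBackoff(200, 2.0, 0.5))
    .build();
```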
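
To make the numbers concrete, the sketch below computes the kind of delays such a policy produces. It is my own illustration of exponential backoff with jitter, not Resilience4j's code: `BackoffSchedule` and `delayMillis` are hypothetical names, and the jitter is assumed to be uniform within +/- the randomization factor.

```java
import java.util.Random;

// Hypothetical helper illustrating exponential backoff with uniform jitter.
final class BackoffSchedule {
    // Delay before retry number `retry` (1-based): base * multiplier^(retry - 1),
    // then spread uniformly within +/- randomizationFactor of that nominal value.
    static long delayMillis(int retry, long baseMillis, double multiplier,
                            double randomizationFactor, Random rng) {
        double nominal = baseMillis * Math.pow(multiplier, retry - 1);
        double delta = nominal * randomizationFactor;
        double jittered = (nominal - delta) + rng.nextDouble() * 2 * delta;
        return Math.round(jittered);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Three attempts in total means at most two retries, hence two delays.
        for (int retry = 1; retry <= 2; retry++) {
            long delay = delayMillis(retry, 200, 2.0, 0.5, rng);
            System.out.println("retry " + retry + " after ~" + delay + " ms");
        }
    }
}
```

With a 200 ms base, a multiplier of 2.0 and a factor of 0.5, the first retry waits between 100 and 300 ms and the second between 200 and 600 ms. The randomness matters: without it, every client that failed at the same moment retries at the same moment too.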

4) Scaling Microservices in Kubernetes/ECS

Scaling is not just about replica counts; it depends on smart policies:

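Kubernetes HPA: Scale pods based on CPU or custom metrics (e.g., p95 latency). The one-line command here covers the CPU case only; custom metrics need an autoscaling/v2 HorizontalPodAutoscaler manifest and a metrics adapter.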
```
kubectl autoscale deployment api --cpu-percent=70 --min=3 --max=10
```
ECS: Use Service Auto Scaling with CloudWatch alarms on queue depth.
Pre-warm: Scale up and warm caches ahead of predictable peaks (e.g., Black Friday) instead of waiting for reactive autoscaling to catch up.
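
For intuition on how the HPA turns a target into a replica count, the Kubernetes documentation gives the core rule as desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). The sketch below applies that rule and clamps it to the min/max bounds. `HpaMath` is a hypothetical helper, and it ignores the tolerance band, stabilization windows and pod-readiness handling that the real controller applies.

```java
// Hypothetical helper: the documented HPA scaling rule, clamped to min/max.
final class HpaMath {
    static int desiredReplicas(int currentReplicas, double currentMetric,
                               double targetMetric, int min, int max) {
        // Scale proportionally to how far the metric is from its target, rounding up.
        int desired = (int) Math.ceil(currentReplicas * (currentMetric / targetMetric));
        // Never go below the floor or above the ceiling configured on the autoscaler.
        return Math.max(min, Math.min(max, desired));
    }

    public static void main(String[] args) {
        // 4 pods at 90% average CPU with a 70% target, bounds 3..10.
        System.out.println(desiredReplicas(4, 90, 70, 3, 10)); // prints 6
    }
}
```

The clamp is why `--max` deserves as much thought as the target: at 8 pods and 140% of a 70% target, the rule asks for 16 replicas but the autoscaler stops at 10.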
