microservices

Fault Tolerance vs Resilience (and Bulkhead pattern).

1. Fault Tolerance vs Resilience:

  • Fault Tolerance: The internal ability of a system to continue operating despite a component failure. (Hardware redundancy).
  • Resilience: The external ability of a system to recover from failures and absorb shocks while maintaining service (Software patterns like Circuit Breakers).

2. Bulkhead Pattern:

Iterated from the hull of a ship. It isolates system parts into pools so that if one fails, the others continue.

  • Thread Pool Bulkhead: Limit the number of threads for a specific service call.
  • Semaphore Bulkhead: Limit the number of concurrent executions.

3. Retry with Exponential Backoff:

Instead of retrying immediately, wait longer between each attempt (1s, 2s, 4s...) to give the downstream service time to recover.

Fault Tolerance vs Resilience (and Bulkhead pattern). | DevExCode