Lab 42
Edge & Chaos

Chaos Engineering — Resilience Dashboard

The whole BookZilla system with live observability: latency, error rate, fallback rate, circuit state, cache hit ratio, and DLQ (dead-letter queue) size. Inject chaos (kill a service, add latency, drop the cache, or overload the CPU) and watch the resilience patterns absorb it while the metrics react. Prove resilience before a real failure tests it for you.

The dashboard below is live. Inject chaos and watch it: with resilience on, the patterns from today's labs absorb the failure; turn resilience off and watch the system fall over.
p95 latency: 120 ms
Error rate: 0.5%
Fallback rate: 0%
Cache hit ratio: 88%
DLQ size: 0
Flight circuit: CLOSED
✓ All healthy — no chaos injected. This is the easy part; production-readiness is about the next line.
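
For readers who want to see where tiles like these could come from, here is a minimal sketch that derives p95 latency, error rate, fallback rate, and cache hit ratio from a window of request records. The `Request` record, its field names, and `summarize` are hypothetical and chosen for illustration; the lab's actual metrics pipeline is not shown on this page, and the nearest-rank percentile is just one common definition.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical per-request record; field names are assumptions."""
    latency_ms: float
    ok: bool          # False if the caller saw an error
    fallback: bool    # True if a fallback response was served
    cache_hit: bool   # True if the response came from the cache


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(requests):
    """Compute dashboard-style metrics over a non-empty window of requests."""
    n = len(requests)
    return {
        "p95_latency_ms": p95([r.latency_ms for r in requests]),
        "error_rate": sum(not r.ok for r in requests) / n,
        "fallback_rate": sum(r.fallback for r in requests) / n,
        "cache_hit_ratio": sum(r.cache_hit for r in requests) / n,
    }
```

Under this definition a fallback counts as a successful request (`ok=True`, `fallback=True`), which is one way the error rate could stay flat while the fallback rate climbs.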

Kill the flight provider with resilience ON: the circuit goes OPEN, fallback rate jumps, error rate barely moves. Toggle resilience OFF with the same chaos and watch latency and errors spike — same failure, very different outcome.
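
The behaviour described above (circuit goes OPEN, fallback rate jumps, errors barely move) can be sketched with a small circuit breaker. This is an illustrative sketch, not BookZilla's implementation: the `CircuitBreaker` class, its thresholds, and the `HALF_OPEN` trial state are assumptions based on the commonly described pattern.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker sketch; names and defaults are hypothetical."""

    def __init__(self, failure_threshold=3, reset_timeout_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.clock = clock            # injectable so tests can control time
        self.state = "CLOSED"
        self.failures = 0             # consecutive failures seen so far
        self.opened_at = 0.0

    def call(self, primary, fallback):
        """Run primary() unless the circuit is open; serve fallback() on failure."""
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.reset_timeout_s:
                return fallback()     # fail fast: do not touch the dead service
            self.state = "HALF_OPEN"  # timeout elapsed: allow one trial call
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if (self.state == "HALF_OPEN"
                    or self.failures >= self.failure_threshold):
                self.state = "OPEN"
                self.opened_at = self.clock()
            return fallback()         # caller gets a degraded answer, not an error
        # Success: reset the failure count and close the circuit.
        self.failures = 0
        self.state = "CLOSED"
        return result
```

With resilience off, the equivalent is calling `primary()` directly: every request waits on the dead provider and surfaces its exception, which is why latency and error rate spike under the same chaos.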

What just happened