Retry a flaky call and watch a naive retry storm pile onto a recovering service. Switch on exponential backoff, then add jitter to de-sync the herd — and see why you must NOT retry a declined card. Then an idempotency key stops a retried charge from billing twice.
A provider just hiccuped and 12 clients want to retry. Watch how the retry strategy decides whether the recovering service heals — or gets buried.
[Chart: load on the recovering service (requests per second) over time · capacity 4/s]
Retry this failure?
🔁 503 Service Unavailable · Retry: transient, it may recover
🔁 Timeout · Retry: transient, retry with backoff
🔁 429 Too Many Requests · Retry: back off, then retry
🛑 400 Bad Request · Don't retry: the request is wrong, and a retry can't fix it
🛑 401 Unauthorized · Don't retry: bad credentials, and a retry can't fix it
🛑 Card declined (no funds) · Don't retry: a business decision, not a glitch
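The retry decisions above can be written as one small predicate. This is a minimal sketch, not any particular client library's API: the function name `should_retry`, its parameters, and the `"card_declined"` error code are all hypothetical, chosen only to mirror the cases listed here.

```python
from typing import Optional


def should_retry(
    status: Optional[int] = None,
    timed_out: bool = False,
    error_code: Optional[str] = None,
) -> bool:
    """Return True only for failures that may go away on their own.

    All names here are illustrative assumptions, not a real library's API.
    """
    # A declined card is a business decision, not a glitch: never retry,
    # whatever HTTP status the provider wraps it in.
    if error_code == "card_declined":
        return False
    # A timeout says nothing about the request being wrong: transient.
    if timed_out:
        return True
    if status is None:
        return False
    # 429 means "slow down", and 5xx means the server had trouble:
    # both are worth retrying after backing off.
    if status == 429 or 500 <= status <= 599:
        return True
    # Everything else (400, 401, ...) will fail the same way next time.
    return False
```

Checking the business-level error code first is a deliberate choice in this sketch: the decline wins even if the provider reports it with a status that would otherwise look retryable.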
Retries create duplicates — unless the request is idempotent
Sent 0× · charges applied 0 · total ₹0
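One way to picture what the server does with an idempotency key is the in-memory sketch below. The `PaymentServer` class, its `charge` method, and the result fields are hypothetical names for illustration only. A real service would need durable storage and an atomic check-and-store, both of which this sketch skips.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PaymentServer:
    """Toy server that remembers the first result for each idempotency key."""

    charges_applied: int = 0
    total: int = 0
    _results: Dict[str, dict] = field(default_factory=dict)

    def charge(self, amount: int, idempotency_key: Optional[str] = None) -> dict:
        # A key we have seen before means this is a repeat of an earlier
        # request: return the stored result and do not charge again.
        if idempotency_key is not None and idempotency_key in self._results:
            return self._results[idempotency_key]

        # First time for this key (or no key at all): apply the charge.
        self.charges_applied += 1
        self.total += amount
        result = {"charge_id": self.charges_applied, "amount": amount}

        if idempotency_key is not None:
            self._results[idempotency_key] = result
        return result


server = PaymentServer()
key = str(uuid.uuid4())  # the client generates one key per logical payment

first = server.charge(500, idempotency_key=key)
retry = server.charge(500, idempotency_key=key)  # same key: no second charge
```

After both calls, `server.charges_applied` is 1 and `retry` is the same result as `first`. Sending the same two requests without a key would apply two charges.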
What just happened
▹Retrying immediately turns a blip into an outage: every client hammers the recovering service at once — a retry storm — and keeps it down. Always back off.
▹Exponential backoff (1s, 2s, 4s…) helps, but if every client backs off on the same clock they stampede together. Add jitter — a random offset — so retries spread out and the service can actually drain.
▹Only retry transient failures (timeouts, 5xx, 429). Never retry a 400/401 or a declined card — the answer won't change, and you just add load.
▹Retries make duplicates possible. An idempotency key lets the server recognize a repeat and return the first result instead of charging twice.
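The backoff-and-jitter advice above can be sketched as follows. This uses one common jitter variant, sometimes called "full jitter", where each delay is drawn uniformly between zero and the exponential ceiling. The page only says "a random offset", so that choice is an assumption. `TransientError`, `backoff_delay`, `call_with_retries`, and the default values are likewise hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeout, 5xx, 429)."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before the retry that follows failed attempt number `attempt` (0-based).

    The ceiling doubles each attempt (1s, 2s, 4s, ... up to `cap`); the
    actual delay is a random point below it, so clients that failed at
    the same moment do not all come back at the same moment.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying only on TransientError, with jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            # Out of attempts: give up and surface the last error.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, base, cap))
    raise AssertionError("unreachable")
```

Any other exception type propagates immediately, which matches the rule that non-transient failures are not retried. The `sleep` parameter exists only so the waiting can be replaced in tests.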