Redundancy & Failover

Run a primary and a backup node as active-passive or active-active, with health checks watching the primary. Kill the primary and watch traffic fail over to the backup, including the brief blip in between, then compare the two redundancy modes on cost and recovery.

Two nodes, traffic flowing. Choose active-passive or active-active, then kill the primary and watch how (and how fast) the system fails over.

[Interactive demo: traffic → load balancer + health check → Node A (primary, serving) and Node B (backup, standby). Initial state: primary serving; backup on standby (idle). Counters: requests served, dropped in failover. Mode: Active-Passive (cheaper, has a blip).]

Kill the primary in active-passive and watch the dropped-request count tick up during detection. Switch to active-active, kill a node, and dropped stays flat — at the cost of always running both.
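
The counter behaviour described above can be sketched as a small tick-based simulation. This is an illustrative model, not the demo's actual code: the `simulate` function, the `Result` type, and every parameter (`kill_at`, `detect_ticks`, `promote_ticks`, `requests_per_tick`) are assumptions. Active-active is also assumed to retry immediately on the surviving node, which is why its dropped count stays at zero here.

```python
from dataclasses import dataclass


@dataclass
class Result:
    served: int
    dropped: int


def simulate(mode: str, ticks: int = 20, kill_at: int = 5,
             detect_ticks: int = 3, promote_ticks: int = 1,
             requests_per_tick: int = 10) -> Result:
    """Count served and dropped requests when Node A dies at tick `kill_at`.

    All parameter values are illustrative assumptions.
    """
    if mode not in ("active-passive", "active-active"):
        raise ValueError(f"unknown mode: {mode}")

    served = dropped = 0
    # In active-passive, Node B only takes traffic after the failure has
    # been detected and B has been promoted.
    backup_ready_at = kill_at + detect_ticks + promote_ticks

    for tick in range(ticks):
        primary_alive = tick < kill_at
        if mode == "active-active":
            # Node B is already serving, so the balancer (assumed to retry
            # on the surviving node) always has somewhere to send requests.
            served += requests_per_tick
        elif primary_alive or tick >= backup_ready_at:
            served += requests_per_tick
        else:
            # Active-passive blip: A is dead and B is not promoted yet.
            dropped += requests_per_tick
    return Result(served, dropped)


for mode in ("active-passive", "active-active"):
    print(mode, simulate(mode))
# active-passive Result(served=160, dropped=40)
# active-active Result(served=200, dropped=0)
```

With these defaults the active-passive blip lasts `detect_ticks + promote_ticks` = 4 ticks, so 40 requests are dropped. Shortening detection shrinks the blip, but in a real system it also makes a false failover more likely.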

What just happened

▹Redundancy means more than one component can do the job; failover is the act of switching to a backup when the primary dies. Health checks watch the primary and trigger the switch.
▹Active-passive keeps a backup on standby. It's simpler to reason about because only one node serves traffic at a time, but the backup sits idle (wasted capacity) and failover isn't instant: there's a detection + promotion blip during which requests fail.
▹Active-active runs both nodes serving traffic. Lose one and the survivor carries on with no real downtime — but you pay for double capacity always, and after a failure you're running at reduced headroom.
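
The "health checks watch the primary and trigger the switch" step can be sketched as a monitor that promotes the backup after several consecutive failed probes. This is a minimal illustration under stated assumptions: the `FailoverMonitor` class, the node names, and the default threshold of 3 are hypothetical, and promotion is modelled as an instant swap, whereas a real system takes time to promote.

```python
from typing import Optional


class FailoverMonitor:
    """Promote the backup after `failure_threshold` consecutive failed probes."""

    def __init__(self, primary: str, backup: Optional[str],
                 failure_threshold: int = 3) -> None:
        self.active = primary
        self.backup = backup
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record_probe(self, healthy: bool) -> str:
        """Feed in one health-check result; return the node that should serve."""
        if healthy:
            # Any success resets the count, so a single failed probe
            # does not trigger a failover.
            self.consecutive_failures = 0
            return self.active

        self.consecutive_failures += 1
        if (self.consecutive_failures >= self.failure_threshold
                and self.backup is not None):
            # Detection is complete; promotion is modelled as an instant swap.
            self.active, self.backup = self.backup, None
            self.consecutive_failures = 0
        return self.active


monitor = FailoverMonitor("node-a", "node-b")
probes = [True, False, True, False, False, False]
print([monitor.record_probe(p) for p in probes])
# ['node-a', 'node-a', 'node-a', 'node-a', 'node-a', 'node-b']
```

The isolated failure at the second probe is ignored because the next probe succeeds. Only the run of three failures at the end triggers the switch. That waiting period is the detection part of the blip.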