Spike the traffic and watch instances scale out on CPU, then scale back in. Feel the two gotchas: cold-start lag, and 50 app instances still choking on one slow database.
Press play, then hit Spike. Watch instances boot to chase the load, and notice the requests dropped during the cold-start gap. Then flip on the slow database.
Traffic: 150 req/s | Instances: 1 | Avg CPU: 0% | Dropped: 0/s
Instance fleet (green = serving, amber = cold-starting)
ready
Keeping up: capacity matches demand.
What just happened
Autoscaling adds instances when a signal (here, CPU) crosses a threshold and removes them when load falls, so capacity tracks demand instead of being fixed.
It isn't instant: new instances cold-start. During those few seconds the existing instances are overloaded and requests are dropped, so a real spike still bites before scaling catches up.
Autoscaling on app CPU is blind to downstream limits. Turn on the slow database and watch requests keep dropping while CPU looks calm: 8 app instances can't beat one capped database.
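The three points above can be reproduced in a small tick-based simulation. This is a minimal sketch, not the demo's actual logic: every number in `Config` (per-instance throughput, thresholds, cold-start length, database cap) is an assumption chosen for illustration, and the names `Config`, `Tick` and `simulate` are hypothetical. It also assumes CPU only reflects requests that complete, so requests stalled behind the database add no CPU load.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Optional


@dataclass
class Config:
    # All values are illustrative assumptions, not taken from the demo.
    per_instance_rps: float = 200.0     # req/s one warm instance can serve
    scale_out_cpu: float = 0.8          # add an instance above this avg CPU
    scale_in_cpu: float = 0.3           # remove an instance below this avg CPU
    cold_start_ticks: int = 3           # ticks before a new instance serves
    max_instances: int = 8
    db_cap_rps: Optional[float] = None  # None means the database is not a limit


class Tick(NamedTuple):
    demand: float
    ready: int
    booting: int
    cpu: float
    dropped: float


def simulate(traffic: List[float], cfg: Config) -> List[Tick]:
    ready = 1
    booting: List[int] = []  # remaining cold-start ticks per new instance
    history: List[Tick] = []
    for demand in traffic:
        # Advance cold starts; instances that reach zero begin serving.
        booting = [t - 1 for t in booting]
        ready += sum(1 for t in booting if t == 0)
        booting = [t for t in booting if t > 0]

        # Only warm instances add capacity; the database may cap it further.
        app_capacity = ready * cfg.per_instance_rps
        served = min(demand, app_capacity)
        if cfg.db_cap_rps is not None:
            served = min(served, cfg.db_cap_rps)
        dropped = demand - served

        # Assumption: CPU tracks completed work only, so requests stalled
        # on the database do not show up as app CPU.
        cpu = served / app_capacity
        history.append(Tick(demand, ready, len(booting), cpu, dropped))

        # Threshold scaling: one instance per tick in either direction.
        if cpu > cfg.scale_out_cpu and ready + len(booting) < cfg.max_instances:
            booting.append(cfg.cold_start_ticks)
        elif cpu < cfg.scale_in_cpu and ready > 1 and not booting:
            ready -= 1
    return history


# Baseline load, a 12-tick spike, then back to baseline.
traffic = [150] * 3 + [900] * 12 + [150] * 10
healthy = simulate(traffic, Config())
slow_db = simulate(traffic, Config(db_cap_rps=300.0))

for name, run in (("healthy db", healthy), ("slow db", slow_db)):
    end_of_spike = run[14]
    print(
        f"{name}: peak instances={max(t.ready for t in run)}, "
        f"total dropped={sum(t.dropped for t in run):.0f}, "
        f"end-of-spike cpu={end_of_spike.cpu:.0%}, "
        f"end-of-spike dropped={end_of_spike.dropped:.0f}/s"
    )
```

In the healthy run, drops are confined to the ticks where new instances are still booting. In the slow-database run the fleet stops growing because average CPU falls below the scale-out threshold, yet requests keep dropping for the rest of the spike. Under these assumptions, scaling on a signal closer to the bottleneck (queue depth or database latency, for example) would be the natural next experiment.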