Split data across DB servers by shard key and watch writes distribute. Then break it: a bad key melts one hot shard while others idle, and a cross-shard join turns ugly.
One database can't take ChatSphere's write volume anymore. Split the data across 4 shards — but the shard key you pick decides whether load spreads evenly or melts one shard.
Shard key
Incoming load: 2000 writes/s (each shard can handle 1000 writes/s)
Shard A: 500/s | Shard B: 500/s | Shard C: 500/s | Shard D: 500/s
Load spread evenly across all shards. This is what a good shard key buys you.
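To make the even split concrete, here is a minimal sketch that routes 2000 writes to 4 shards in two ways: by a hash of user_id and by country. The function names, the sample keys, and the 80% single-country skew are assumptions made for illustration; only the 4 shards, the 2000 writes/s load, and the 1000 writes/s per-shard capacity come from the demo.

```python
import hashlib
from collections import Counter

SHARDS = ["A", "B", "C", "D"]
SHARD_CAPACITY = 1000  # writes/s each shard can handle (from the demo)


def shard_for(key: str) -> str:
    """Map a key to a shard with a stable hash.

    Python's built-in hash() is salted per process, so a digest is used
    to keep the routing identical across runs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def load_per_shard(keys):
    """Count how many writes land on each shard for a list of shard keys."""
    counts = Counter(shard_for(k) for k in keys)
    return {shard: counts.get(shard, 0) for shard in SHARDS}


# Hypothetical one-second workload of 2000 writes.
# Good key: one distinct user_id per write, so the hash spreads them out.
user_keys = [f"user-{i}" for i in range(2000)]
# Skewed key (assumed mix): 80% of writes carry the same country value.
country_keys = ["US"] * 1600 + ["DE"] * 200 + ["JP"] * 200

by_user = load_per_shard(user_keys)
by_country = load_per_shard(country_keys)

for label, load in (("user_id", by_user), ("country", by_country)):
    hot = [s for s, n in load.items() if n > SHARD_CAPACITY]
    print(f"{label:8} {load}  over capacity: {hot or 'none'}")
```

With the user_id key, every shard ends up near 500 writes. With the country key, all 1600 "US" writes hash to the same shard, which pushes it past its 1000 writes/s capacity no matter how many shards exist.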
Now query the sharded data
The query resolves the shard key to ONE shard (B) and reads only there. That makes it fast: this is sharding working as intended.
What just happened
▹Sharding splits one dataset across several database servers so writes and storage scale past what a single node can hold. Each shard owns a slice of the keys.
▹Everything rides on the shard key. A good key (like a hash of user_id) spreads load evenly; a skewed key (like country) overloads one hot shard while the others idle.
▹Queries that target the shard key hit one shard and stay fast. Queries that span all keys (a global feed, a cross-shard join) must fan out to every shard and merge the results, so they cost one request per shard and finish only as fast as the slowest shard. That's why sharding is a last resort, not a first step.
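The difference between the two query shapes can be sketched with in-memory dictionaries standing in for the shards. Everything here is hypothetical (the function names, the message layout, the sample data); it is a sketch of the routing idea, not how any particular database implements it. A lookup by user_id touches one shard, while a global feed has to read all four and merge their partial results.

```python
import hashlib
import heapq
from itertools import islice

NUM_SHARDS = 4

# Each shard is a dict of user_id -> list of (timestamp, text) messages.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_index(user_id: str) -> int:
    """Stable hash of the shard key, reduced to a shard number."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def write_message(user_id: str, timestamp: int, text: str) -> None:
    """Route a write to the single shard that owns this user_id."""
    shards[shard_index(user_id)].setdefault(user_id, []).append((timestamp, text))


def messages_for_user(user_id: str):
    """Point query on the shard key: exactly one shard is read."""
    idx = shard_index(user_id)
    return idx, sorted(shards[idx].get(user_id, []))


def global_feed(limit: int):
    """Fan-out query: read every shard, then merge the partial results."""
    partials = []
    for shard in shards:
        rows = sorted(
            ((ts, text) for msgs in shard.values() for ts, text in msgs),
            reverse=True,
        )
        partials.append(rows[:limit])  # each shard returns its own newest rows
    # Inputs are sorted newest-first, so merge them in the same order.
    merged = heapq.merge(*partials, reverse=True)
    return list(islice(merged, limit)), len(partials)


# Sample data: 20 messages spread over 5 users.
for ts in range(20):
    write_message(f"user-{ts % 5}", ts, f"msg {ts}")

shard_read, own_messages = messages_for_user("user-3")
feed, shards_read = global_feed(5)
print(f"user-3: read shard {shard_read} only, {len(own_messages)} messages")
print(f"global feed: read {shards_read} shards, newest ts = {[ts for ts, _ in feed]}")
```

The point query does one lookup regardless of how many shards exist. The feed query does work proportional to the shard count, and each shard has to return its own top rows because no single shard knows which of them are newest overall.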