Sharding & the Shard Key

Split data across DB servers by shard key and watch writes distribute. Then break it: a bad key melts one hot shard while others idle, and a cross-shard join turns ugly.

One database can't take ChatSphere's write volume anymore. Split the data across 4 shards — but the shard key you pick decides whether load spreads evenly or melts one shard.

Shard key

✍️ Write load: 2000 writes/s

Each shard can handle up to 1000 writes/s

🗄️ Shard A: 500/s

🗄️ Shard B: 500/s

🗄️ Shard C: 500/s

🗄️ Shard D: 500/s

Load spread evenly across all shards. This is what a good shard key buys you.
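The even spread above, and the hot shard a skewed key produces, can be sketched in a few lines of Python. This is a minimal simulation rather than ChatSphere's actual routing: the SHA-256-modulo scheme, the `COUNTRY_TO_SHARD` directory, and the 1400/300/200/100 country mix are all assumptions made up for illustration. Only the 2000 writes/s load, the 4 shards, and the 1000 writes/s per-shard figure come from the demo.

```python
import hashlib
from collections import Counter

SHARDS = ["A", "B", "C", "D"]
CAPACITY = 1000  # writes/s one shard can absorb (figure from the demo)

def shard_by_hash(user_id: str) -> str:
    """Hash the key, then take the result modulo the shard count."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]

# Hypothetical directory: each country is pinned to one shard.
COUNTRY_TO_SHARD = {"US": "A", "IN": "B", "BR": "C", "DE": "D"}

def shard_by_country(country: str) -> str:
    return COUNTRY_TO_SHARD[country]

# One second of traffic: 2000 writes with an invented, skewed country mix.
countries = ["US"] * 1400 + ["IN"] * 300 + ["BR"] * 200 + ["DE"] * 100
writes = [(f"user-{i}", country) for i, country in enumerate(countries)]

hash_load = Counter(shard_by_hash(user_id) for user_id, _ in writes)
country_load = Counter(shard_by_country(country) for _, country in writes)

def overloaded(load: Counter) -> list[str]:
    """Return the shards whose write rate exceeds CAPACITY."""
    return sorted(shard for shard, rate in load.items() if rate > CAPACITY)

print("hash(user_id):", dict(hash_load), "overloaded:", overloaded(hash_load))
print("country:      ", dict(country_load), "overloaded:", overloaded(country_load))
```

With the hashed key every shard lands near 500 writes/s, well under capacity. With the country key, shard A receives all 1400 US writes and exceeds its 1000 writes/s limit while shard D sees only 100.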

Now query the sharded data

A query on the shard key resolves to ONE shard (B) and reads only there. Fast: this is sharding working as intended.

What just happened

▹ Sharding splits one dataset across several database servers so writes and storage scale past what a single node can hold. Each shard owns a slice of the keys.
▹ Everything rides on the shard key. A good key (like a hash of user_id) spreads load evenly; a skewed key (like country) overloads one hot shard while the others idle.
▹ Queries that target the shard key hit one shard and stay fast. Queries that span all keys (a global feed, a cross-shard join) must fan out to every shard and merge, which is slow and awkward. That's why sharding is a last resort, not a first step.
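The difference between a targeted query and a fan-out can be sketched with an in-memory stand-in for the four shards. Everything here is hypothetical: the `Cluster` class, its method names, and the `(timestamp, user_id, text)` message layout are illustrative assumptions, not an API from any real database. Each query returns its rows together with the number of shards it had to touch.

```python
import hashlib
import heapq

NUM_SHARDS = 4

def shard_index(user_id: str) -> int:
    """Map a user_id to a shard by hashing it (assumed scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

class Cluster:
    """Toy cluster: each shard is a list of (timestamp, user_id, text)."""

    def __init__(self) -> None:
        self.shards = [[] for _ in range(NUM_SHARDS)]

    def write(self, user_id: str, ts: int, text: str) -> None:
        self.shards[shard_index(user_id)].append((ts, user_id, text))

    def messages_for(self, user_id: str):
        """Targeted query: the shard key names exactly one shard."""
        shard = self.shards[shard_index(user_id)]
        rows = sorted(m for m in shard if m[1] == user_id)
        return rows, 1

    def global_feed(self, limit: int):
        """Scatter-gather: ask every shard for its newest rows, then merge."""
        partials = [heapq.nlargest(limit, shard) for shard in self.shards]
        merged = heapq.nlargest(limit, (m for part in partials for m in part))
        return merged, len(self.shards)

cluster = Cluster()
for ts in range(100):
    cluster.write(f"user-{ts % 10}", ts, f"msg {ts}")

mine, shards_for_lookup = cluster.messages_for("user-3")
feed, shards_for_feed = cluster.global_feed(5)
print(len(mine), "rows from", shards_for_lookup, "shard")
print([m[0] for m in feed], "from", shards_for_feed, "shards")
```

The lookup by `user_id` reads one shard no matter how many shards exist. The global feed has to query all four and merge the partial results, so its cost grows with the shard count and its latency is bounded by the slowest shard.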