Database Sharding

1

What is it?

Sharding splits a large database into smaller pieces (shards), each stored on a different server. Each shard holds a distinct subset of the data, so no single server has to store or serve everything.

2

Think of it like...

The Library Branches

Like splitting a huge library into branches by alphabet, sharding splits data across servers.

📚

Shard A

Books A-M

📚

Shard B

Books N-Z

🔍

Router

Knows which branch

3

Visual Flow

👤User Request

Find user data

🧭Shard Router

Directs to right shard

💾Shard 3

Has this user's data

4

How it works

1

Choose Shard Key

Decide how to split data (by user ID, location, etc.)

2

Distribute Data

Split data across multiple database servers

3

Route Requests

The application (or a routing layer) uses the shard key to work out which shard holds the data

4

Query Shard

Query only the relevant shard instead of searching all the data

5

Return Result

Much faster, since the query searches a smaller dataset
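The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `ShardRouter` class, its method names, and the choice of hash-based routing are assumptions made for the example, and each "shard" is just an in-memory dictionary standing in for a separate database server.

```python
import hashlib


class ShardRouter:
    """Hypothetical router: maps a shard key (user ID) to one of N shards."""

    def __init__(self, shard_count: int) -> None:
        # Step 2: each dict stands in for a separate database server.
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, user_id: int) -> int:
        # Steps 1 and 3: the user ID is the shard key. A stable hash keeps
        # the mapping identical across processes and restarts.
        digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, user_id: int, record: dict) -> None:
        self.shards[self.shard_for(user_id)][user_id] = record

    def get(self, user_id: int):
        # Steps 4 and 5: look in exactly one shard and return the result.
        return self.shards[self.shard_for(user_id)].get(user_id)


router = ShardRouter(shard_count=4)
router.put(42, {"name": "Ada"})
print(router.shard_for(42), router.get(42))
```

Hashing the key tends to spread users evenly across shards, but changing the shard count remaps most keys. Range-based or directory-based routing makes a different trade-off.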

5

Common Mistake

Wrong

Sharding is the same as replication

Correct

Replication copies the same data to multiple servers (for redundancy and read throughput). Sharding splits different data across servers (for storage and write scale). They solve different problems and are often used together.
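The difference is easy to see with toy data. In this sketch the function names `replicate` and `shard` are made up for the illustration, servers are plain dictionaries, and the modulo split is only one possible way to assign records.

```python
def replicate(records: dict, server_count: int) -> list:
    """Replication: every server holds a full copy of the same data."""
    return [dict(records) for _ in range(server_count)]


def shard(records: dict, server_count: int) -> list:
    """Sharding: each record lives on exactly one server."""
    servers = [dict() for _ in range(server_count)]
    for key, value in records.items():
        servers[key % server_count][key] = value
    return servers


records = {user_id: f"user-{user_id}" for user_id in range(1, 7)}
replicas = replicate(records, 3)
shards = shard(records, 3)

print([len(s) for s in replicas])  # [6, 6, 6]: every server has everything
print([len(s) for s in shards])    # [2, 2, 2]: each server has a distinct slice
```

With replication, total storage grows with the number of servers. With sharding, it stays the same and the load per server shrinks.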

💡 Real-World Example

Instagram shards by user ID. A simplified illustration using ID ranges:

1

Shard 1: Users with ID 1-1,000,000

2

Shard 2: Users with ID 1,000,001-2,000,000

3

Shard 3: Users with ID 2,000,001-3,000,000

4

Your ID is 1,500,000 → Always goes to Shard 2

5

Billions of users overall, but each shard holds a manageable amount of data
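The range lookup in this example can be sketched as follows. The constant `SHARD_UPPER_BOUNDS` and the function `shard_for_user` are hypothetical names, and the three ranges are the illustrative ones listed above, not a real production layout.

```python
from bisect import bisect_left

# Highest user ID stored on each shard (Shard 1, Shard 2, Shard 3).
SHARD_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def shard_for_user(user_id: int) -> int:
    """Return the 1-based shard number whose ID range contains user_id."""
    if user_id < 1 or user_id > SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"user_id {user_id} is outside every shard range")
    # bisect_left finds the first upper bound that is >= user_id.
    return bisect_left(SHARD_UPPER_BOUNDS, user_id) + 1


print(shard_for_user(1_500_000))  # 2
print(shard_for_user(1_000_000))  # 1 (upper bounds are inclusive)
print(shard_for_user(2_000_001))  # 3
```

Adding a shard only means appending a new upper bound. The downside of ranges is that the newest shard often takes most of the writes, since new users get the highest IDs.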