Database Sharding

Sharding splits a large database into smaller, more manageable pieces called shards. Each shard holds a subset of the data, determined by a shard key (like user ID or tenant ID). Queries are routed to the appropriate shard based on this key, and each shard operates as an independent database instance.

The primary benefit is horizontal scalability. When a single database server can no longer handle the load, sharding distributes both data and query traffic across multiple servers. This allows the system to scale linearly by adding more shards as data grows.

Sharding introduces significant complexity: cross-shard queries are expensive, rebalancing shards when data distribution is uneven is operationally challenging, and application logic must be shard-aware. Many teams find that read replicas, caching layers, and vertical scaling defer the need for sharding. When sharding becomes necessary, choosing the right shard key is the most critical decision, as it determines data distribution and query routing efficiency.

Related Terms

A/B Testing

Feature Flag

MLOps

Model Serving

Semantic Search

CI/CD (Continuous Integration / Continuous Deployment)