Database Sharding: When, Why, and How
A comprehensive guide to horizontal database sharding. Covers sharding keys, strategies (hash, range, geo), cross-shard queries, rebalancing, and when NOT to shard.
Sharding splits your database horizontally across multiple servers, each holding a subset of the data. It’s the nuclear option for scaling — incredibly powerful, but it introduces complexity that most teams underestimate. Before you shard, make sure you’ve exhausted every other optimization: indexing, query tuning, read replicas, caching, and vertical scaling.
The most common mistake: sharding too early. Teams shard at 100GB when vertical scaling easily handles 2-5TB. Premature sharding adds months of engineering overhead for problems that don’t exist yet.
When You Actually Need Sharding
| Symptom | Try First | Shard If |
|---|---|---|
| Slow queries | Indexing, query optimization | Already optimized, table has 1B+ rows |
| Write bottleneck | Vertical scaling, write batching | Single node can’t handle write IOPS |
| Storage limits | Archiving, partitioning | Dataset exceeds single-node storage |
| Multi-region latency | Read replicas per region | Writes must be region-local |
| Connection limits | Connection pooling (PgBouncer) | Still hitting single-server limits |
Rule of thumb: If your database is under 500GB and under 10K writes/sec, you probably don’t need sharding.
Optimization Ladder (Do These First)
| Step | Effort | Impact | Try Before |
|---|---|---|---|
| 1. Add proper indexes | Hours | 10-100x query speed | Anything else |
| 2. Query optimization | Days | 2-10x throughput | Read replicas |
| 3. Read replicas | Days | Scales reads to N nodes | Caching |
| 4. Application-level caching | Week | Reduces DB load 50-90% | Vertical scaling |
| 5. Vertical scaling | Hours | 2-4x capacity | Partitioning |
| 6. Table partitioning | Week | Efficient range queries | Sharding |
| 7. Sharding | Months | Infinite horizontal scale | Last resort |
Sharding Strategies
1. Hash-Based Sharding
Distribute rows by hashing the shard key. Provides even distribution.
```python
import hashlib

def get_shard(tenant_id: str, num_shards: int = 8) -> int:
    """Hash-based shard selection (simple modulo, not consistent hashing:
    changing num_shards remaps most keys)."""
    hash_val = int(hashlib.md5(tenant_id.encode()).hexdigest(), 16)
    return hash_val % num_shards

# All data for tenant_123 goes to the same shard
shard = get_shard("tenant_123")  # Returns 0-7
connection = shard_connections[shard]
connection.execute("SELECT * FROM orders WHERE tenant_id = %s", ("tenant_123",))
```
Pros: Even data distribution, simple implementation.
Cons: Range queries require scatter-gather; adding shards requires rehashing.
2. Range-Based Sharding
Split data by value ranges — common for time-series and ordered data.
```python
from datetime import datetime

# (boundary, shard_id) pairs, sorted by boundary ascending
SHARD_RANGES = [
    (datetime(2025, 1, 1), 0),
    (datetime(2025, 7, 1), 1),
    (datetime(2026, 1, 1), 2),
    (datetime(2026, 7, 1), 3),
]

def get_time_shard(created_at: datetime) -> int:
    # Scan newest boundary first; the first one at or before created_at wins
    for boundary, shard_id in reversed(SHARD_RANGES):
        if created_at >= boundary:
            return shard_id
    return 0  # timestamps before the first boundary land on shard 0
```
Pros: Range queries hit a single shard, natural data archiving.
Cons: Hot spots (the newest shard gets all writes), uneven distribution.
3. Geographic Sharding
Route data by region for compliance and latency:
```python
GEO_SHARDS = {
    "us": {"host": "db-us-east.example.com", "regions": ["US", "CA", "MX"]},
    "eu": {"host": "db-eu-west.example.com", "regions": ["DE", "FR", "GB", "NL"]},
    "apac": {"host": "db-apac.example.com", "regions": ["JP", "AU", "SG", "KR"]},
}

def get_geo_shard(user_country: str) -> dict:
    for shard_name, config in GEO_SHARDS.items():
        if user_country in config["regions"]:
            return config
    return GEO_SHARDS["us"]  # Default for unlisted countries
```
Pros: Data residency compliance (GDPR), low latency for regional users.
Cons: Cross-region queries are slow, uneven load distribution.
4. Directory-Based Sharding
A lookup table maps each entity to its shard:
```sql
-- Shard directory table (on a separate, small database)
CREATE TABLE shard_directory (
    entity_id  VARCHAR(64) PRIMARY KEY,
    shard_id   INT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Lookup before every query
SELECT shard_id FROM shard_directory WHERE entity_id = 'tenant_123';
```
Pros: Maximum flexibility, easy rebalancing.
Cons: Single point of failure (the directory DB), an extra lookup per query.
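In application code, the directory lookup is typically wrapped in a small router with an in-process cache so most queries skip the extra round-trip. A minimal sketch, assuming a hypothetical `directory_db` connection object with `query_one`/`execute` methods (not part of any specific driver):

```python
class ShardDirectory:
    """Directory-based router with an in-process cache. Invalidate the
    cached entry whenever an entity is moved to a different shard."""

    def __init__(self, directory_db):
        self.directory_db = directory_db  # connection to the small directory DB
        self.cache = {}                   # entity_id -> shard_id

    def get_shard_id(self, entity_id: str) -> int:
        if entity_id in self.cache:
            return self.cache[entity_id]
        row = self.directory_db.query_one(
            "SELECT shard_id FROM shard_directory WHERE entity_id = %s",
            (entity_id,),
        )
        self.cache[entity_id] = row["shard_id"]
        return row["shard_id"]

    def move_entity(self, entity_id: str, new_shard_id: int) -> None:
        # Rebalancing is a single row update plus cache invalidation
        self.directory_db.execute(
            "UPDATE shard_directory SET shard_id = %s WHERE entity_id = %s",
            (new_shard_id, entity_id),
        )
        self.cache.pop(entity_id, None)
```

The cache is what keeps the "extra lookup per query" cost acceptable; in a multi-process deployment you would also need a cross-process invalidation signal, which is omitted here.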
Strategy Selection Guide
| Strategy | Best For | Avoid When |
|---|---|---|
| Hash | Even distribution across all shards | Need range queries on shard key |
| Range | Time-series, ordered data, archiving | High write volume on latest range |
| Geographic | GDPR compliance, regional latency | Users access cross-region data frequently |
| Directory | Flexible tenant placement, VIP routing | Ultra-low-latency lookups required |
Shard Key Selection
The shard key is the most critical architectural decision. Choose wrong and you’ll spend months re-sharding.
Good Shard Keys
| Key | Why | Best For |
|---|---|---|
| `tenant_id` | All tenant data colocated | Multi-tenant SaaS |
| `user_id` | User data isolation | Social platforms, user-centric apps |
| `region` | Geographic isolation | Compliance-driven apps |
| `order_id` (hashed) | Even distribution | High-volume transaction systems |
Bad Shard Keys
| Key | Why It Fails |
|---|---|
| `created_at` | All writes hit the newest shard (hot spot) |
| `status` | Low cardinality, so uneven distribution |
| `email` | Changes frequently, requiring shard migration |
| Auto-increment `id` | Sequential: all writes go to one shard |
Compound Shard Keys
Shard key: `(tenant_id, HASH(order_id))`
Benefits:
- All tenant data stays on same set of shards (locality)
- Orders within tenant are distributed (no hot spots)
- Queries for a specific tenant don't scatter across all shards
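One way to realize a compound key is to let the tenant hash pick a fixed-size group of shards and the order hash pick a shard within that group. A sketch under assumed numbers (4 shards per tenant group, 32 shards total; both values and the MD5 choice are illustrative):

```python
import hashlib

def get_compound_shard(tenant_id: str, order_id: str,
                       shards_per_tenant: int = 4,
                       total_shards: int = 32) -> int:
    """Map (tenant_id, order_id) to a shard: the tenant hash selects a
    contiguous group of shards, the order hash selects one within it."""
    def h(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    num_groups = total_shards // shards_per_tenant
    group = h(tenant_id) % num_groups         # tenant locality
    offset = h(order_id) % shards_per_tenant  # spread within the group
    return group * shards_per_tenant + offset
```

With this layout a tenant-scoped query fans out to 4 shards instead of 32, while a single tenant's write load is still spread across 4 nodes.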
Cross-Shard Query Patterns
Scatter-Gather
Query all shards in parallel, merge results:
```python
import asyncio

async def search_all_shards(query: str, params: dict) -> list:
    """Run the same query on every shard in parallel, then merge."""
    tasks = [shard.execute(query, params) for shard in all_shards]
    results = await asyncio.gather(*tasks)
    merged = []
    for result_set in results:
        merged.extend(result_set)
    # Re-sort globally: each shard only returns locally sorted rows
    return sorted(merged, key=lambda r: r["created_at"], reverse=True)[:100]
```
Cost: N round-trips (one per shard). Acceptable for dashboards, unacceptable for real-time APIs.
Global Tables
Keep small, frequently-joined reference data replicated across all shards:
```
Global tables (replicated to every shard):
├── countries
├── currencies
├── feature_flags
└── pricing_tiers

Sharded tables (distributed by tenant_id):
├── users
├── orders
├── invoices
└── activity_logs
```
Rebalancing
When shards become unevenly loaded, you need to move data between them.
Live Rebalancing Process
1. Create new shard(s)
2. Begin dual-writing to old and new shard
3. Backfill historical data to new shard
4. Verify data consistency (checksums)
5. Switch reads to new shard
6. Stop writing to old shard
7. Clean up old shard data
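Step 2, dual-writing, can be sketched as follows. This assumes shard clients expose a `put` method (a placeholder, not a specific driver API), keeps the old shard as the source of truth, and queues failed writes to the new shard for repair instead of failing the request:

```python
class DualWriter:
    """Dual-write during migration: the old shard remains the source of
    truth; writes to the new shard are best-effort, and failures are
    queued so the backfill/verification step can repair them."""

    def __init__(self, old_shard, new_shard):
        self.old_shard = old_shard
        self.new_shard = new_shard
        self.failed_writes = []  # (key, row) pairs to replay later

    def write(self, key: str, row: dict) -> None:
        self.old_shard.put(key, row)      # must succeed or the request fails
        try:
            self.new_shard.put(key, row)  # tolerated failure during migration
        except Exception:
            self.failed_writes.append((key, row))
```

The consistency check in step 4 is what catches any divergence this best-effort path introduces; only after it passes do reads switch over.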
Consistent Hashing
Minimizes data movement when adding/removing shards:
```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    def __init__(self, nodes: list, virtual_nodes: int = 150):
        # Each physical node gets many virtual points on the ring,
        # which smooths the distribution of keys across nodes
        self.ring = {}
        self.sorted_keys = []
        for node in nodes:
            for i in range(virtual_nodes):
                key = self._hash(f"{node}:{i}")
                self.ring[key] = node
                self.sorted_keys.append(key)
        self.sorted_keys.sort()

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        # First ring point clockwise from the key's hash (wrapping around)
        h = self._hash(key)
        idx = bisect_right(self.sorted_keys, h) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

# Adding a new node only moves ~1/N of the data
ring = ConsistentHashRing(["shard-1", "shard-2", "shard-3"])
ring.get_node("tenant_123")  # deterministic: always the same shard for this key
```
Alternatives to Application-Level Sharding
| Solution | How It Works | When to Use |
|---|---|---|
| PostgreSQL table partitioning | Split table by range/list/hash within one server | <1TB, need partition pruning |
| Citus (PostgreSQL) | Distributed PostgreSQL with transparent sharding | Want SQL compatibility |
| Vitess (MySQL) | Sharding middleware for MySQL | YouTube-scale MySQL workloads |
| CockroachDB | Distributed SQL with auto-sharding | Multi-region, strong consistency |
| MongoDB | Native sharding with `sh.shardCollection()` | Document workloads |
Failure Modes and Recovery
| Failure | Impact | Recovery Strategy |
|---|---|---|
| Single shard down | Partial outage (1/N of data unavailable) | Replica promotion, automatic failover |
| Routing layer down | Full outage (no requests reach any shard) | Redundant routing, health checks |
| Shard key regret | Expensive re-sharding required | Dual-write migration to new key |
| Unbalanced shards | Hot shard degrades, others idle | Rebalance with consistent hashing |
| Cross-shard deadlock | Queries hang, cascading timeouts | Timeout policies, circuit breakers |
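For the cross-shard deadlock row, a first line of defense is bounding every per-shard call with a timeout so one stuck shard cannot hang the whole scatter-gather. A sketch using `asyncio.wait_for`; the 2-second default and the drop-partial-results policy are assumptions, not recommendations for every workload:

```python
import asyncio

async def query_with_timeout(shard, query, timeout_s: float = 2.0):
    """Bound a single shard call; a timeout yields None instead of hanging."""
    try:
        return await asyncio.wait_for(shard.execute(query), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None

async def gather_partial(shards, query, timeout_s: float = 2.0):
    """Scatter-gather that returns whatever arrived within the deadline."""
    results = await asyncio.gather(
        *(query_with_timeout(s, query, timeout_s) for s in shards)
    )
    return [r for r in results if r is not None]
```

Whether partial results are acceptable is an application decision: a dashboard can tolerate a missing shard, a billing query cannot.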
Shard Monitoring Dashboard Metrics
Track these per shard to detect problems early:
- QPS per shard — Should be roughly equal; more than 2x difference signals imbalance
- P99 latency per shard — Rising latency on one shard indicates resource pressure
- Disk usage per shard — Uneven growth means your shard key has skewed distribution
- Connection count per shard — Connection pooling should distribute evenly
- Replication lag — If using read replicas per shard, lag over 1s needs investigation
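The 2x QPS rule above is easy to check mechanically. A sketch, assuming you can fetch a per-shard QPS map from your metrics system (the data source is not specified here):

```python
def detect_imbalance(qps_by_shard: dict, ratio: float = 2.0) -> list:
    """Return shards whose QPS exceeds `ratio` times the quietest shard's
    QPS -- the 2x rule of thumb for spotting hot shards."""
    if len(qps_by_shard) < 2:
        return []
    floor = max(min(qps_by_shard.values()), 1e-9)  # avoid divide-by-zero
    return [name for name, qps in qps_by_shard.items() if qps / floor > ratio]
```

Wiring this into an alert gives early warning that the shard key's distribution is skewing, while rebalancing is still cheap.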
Sharding Checklist
- Exhausted single-node optimizations first (indexes → replicas → caching → vertical)
- Shard key chosen based on access patterns (not just even distribution)
- Strategy selected (hash, range, geo, directory) based on query patterns
- Cross-shard query patterns identified and cost accepted
- Global/reference tables identified for replication
- Monitoring per-shard metrics (size, QPS, latency, connection count)
- Rebalancing strategy documented (consistent hashing or directory)
- Backup/restore tested per-shard
- Application routing layer tested for failover
- Data consistency verification automated (checksums, row counts)
- Rollback plan documented in case of shard key regret