
Database Sharding: When, Why, and How

A comprehensive guide to horizontal database sharding. Covers shard keys, strategies (hash, range, geo), cross-shard queries, rebalancing, and when NOT to shard.

Sharding splits your database horizontally across multiple servers, each holding a subset of the data. It’s the nuclear option for scaling — incredibly powerful, but it introduces complexity that most teams underestimate. Before you shard, make sure you’ve exhausted every other optimization: indexing, query tuning, read replicas, caching, and vertical scaling.

The most common mistake: sharding too early. Teams shard at 100GB when vertical scaling easily handles 2-5TB. Premature sharding adds months of engineering overhead for problems that don’t exist yet.


When You Actually Need Sharding

| Symptom | Try First | Shard If |
|---|---|---|
| Slow queries | Indexing, query optimization | Already optimized, table has 1B+ rows |
| Write bottleneck | Vertical scaling, write batching | Single node can’t handle write IOPS |
| Storage limits | Archiving, partitioning | Dataset exceeds single-node storage |
| Multi-region latency | Read replicas per region | Writes must be region-local |
| Connection limits | Connection pooling (PgBouncer) | Still hitting single-server limits |

Rule of thumb: If your database is under 500GB and under 10K writes/sec, you probably don’t need sharding.

Optimization Ladder (Do These First)

| Step | Effort | Impact | Try Before |
|---|---|---|---|
| 1. Add proper indexes | Hours | 10-100x query speed | Anything else |
| 2. Query optimization | Days | 2-10x throughput | Read replicas |
| 3. Read replicas | Days | Scales reads to N nodes | Caching |
| 4. Application-level caching | Week | Reduces DB load 50-90% | Vertical scaling |
| 5. Vertical scaling | Hours | 2-4x capacity | Partitioning |
| 6. Table partitioning | Week | Efficient range queries | Sharding |
| 7. Sharding | Months | Near-unlimited horizontal scale | Last resort |

Sharding Strategies

1. Hash-Based Sharding

Distribute rows by hashing the shard key. Provides even distribution.

```python
import hashlib

def get_shard(tenant_id: str, num_shards: int = 8) -> int:
    """Hash-based shard selection (stable only for a fixed num_shards)."""
    hash_val = int(hashlib.md5(tenant_id.encode()).hexdigest(), 16)
    return hash_val % num_shards

# All data for tenant_123 goes to the same shard
shard = get_shard("tenant_123")  # Returns 0-7
connection = shard_connections[shard]
connection.execute("SELECT * FROM orders WHERE tenant_id = %s", ("tenant_123",))
```

Pros: Even data distribution, simple implementation
Cons: Range queries require scatter-gather, adding shards requires rehashing

2. Range-Based Sharding

Split data by value ranges — common for time-series and ordered data.

```python
from datetime import datetime

# Ascending shard boundaries: each shard holds a half-year of data
SHARD_RANGES = [
    (datetime(2025, 1, 1), 0),
    (datetime(2025, 7, 1), 1),
    (datetime(2026, 1, 1), 2),
    (datetime(2026, 7, 1), 3),
]

def get_time_shard(created_at: datetime) -> int:
    # Walk boundaries newest-first; return the first shard that covers the date
    for boundary, shard_id in reversed(SHARD_RANGES):
        if created_at >= boundary:
            return shard_id
    return 0  # Dates before the first boundary fall through to shard 0
```

Pros: Range queries hit a single shard, natural data archiving
Cons: Hot spots (the newest shard gets all writes), uneven distribution

3. Geographic Sharding

Route data by region for compliance and latency:

```python
GEO_SHARDS = {
    "us": {"host": "db-us-east.example.com", "regions": ["US", "CA", "MX"]},
    "eu": {"host": "db-eu-west.example.com", "regions": ["DE", "FR", "GB", "NL"]},
    "apac": {"host": "db-apac.example.com", "regions": ["JP", "AU", "SG", "KR"]},
}

def get_geo_shard(user_country: str) -> dict:
    for shard_name, config in GEO_SHARDS.items():
        if user_country in config["regions"]:
            return config
    return GEO_SHARDS["us"]  # Default
```

Pros: Data residency compliance (GDPR), low latency for regional users
Cons: Cross-region queries are slow, uneven load distribution

4. Directory-Based Sharding

A lookup table maps each entity to its shard:

```sql
-- Shard directory table (on a separate, small database)
CREATE TABLE shard_directory (
    entity_id VARCHAR(64) PRIMARY KEY,
    shard_id INT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Lookup before every query
SELECT shard_id FROM shard_directory WHERE entity_id = 'tenant_123';
```

Pros: Maximum flexibility, easy rebalancing
Cons: Single point of failure (the directory DB), extra lookup per query
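The extra lookup per query can be amortized by caching directory entries in the application. A minimal sketch, using an in-memory dict as a hypothetical stand-in for the `shard_directory` table (in production the lookup would be a SELECT against the directory database):

```python
from functools import lru_cache

# Hypothetical stand-in for the shard_directory table shown above
DIRECTORY = {"tenant_123": 2, "tenant_456": 0}

@lru_cache(maxsize=100_000)
def lookup_shard(entity_id: str) -> int:
    # Cache hits skip the directory round-trip entirely
    return DIRECTORY[entity_id]

# After rebalancing moves entities, invalidate cached placements:
# lookup_shard.cache_clear()
```

The cache must be invalidated whenever the directory changes, or queries will route to a shard the entity has already left.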

Strategy Selection Guide

| Strategy | Best For | Avoid When |
|---|---|---|
| Hash | Even distribution across all shards | Need range queries on shard key |
| Range | Time-series, ordered data, archiving | High write volume on latest range |
| Geographic | GDPR compliance, regional latency | Users access cross-region data frequently |
| Directory | Flexible tenant placement, VIP routing | Ultra-low-latency lookups required |

Shard Key Selection

The shard key is the most critical architectural decision. Choose wrong and you’ll spend months re-sharding.

Good Shard Keys

| Key | Why | Best For |
|---|---|---|
| tenant_id | All tenant data colocated | Multi-tenant SaaS |
| user_id | User data isolation | Social platforms, user-centric apps |
| region | Geographic isolation | Compliance-driven apps |
| order_id (hashed) | Even distribution | High-volume transaction systems |

Bad Shard Keys

| Key | Why It Fails |
|---|---|
| created_at | All writes hit the newest shard (hot spot) |
| status | Low cardinality — uneven distribution |
| email | Frequently changes, requires shard migration |
| Auto-increment id | Sequential — all writes go to one shard |

Compound Shard Keys

Shard key: (tenant_id, HASH(order_id))

Benefits:
- All tenant data stays on same set of shards (locality)
- Orders within tenant are distributed (no hot spots)
- Queries for a specific tenant don't scatter across all shards
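One way to realize such a compound key in application code is a two-level lookup: the tenant hash picks a shard group, and the hashed order id picks a member within that group. A sketch; the group and member counts are illustrative, not prescribed by the text:

```python
import hashlib

def _h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def compound_shard(tenant_id: str, order_id: str,
                   groups: int = 4, shards_per_group: int = 4) -> int:
    group = _h(tenant_id) % groups            # tenant locality: a fixed group
    member = _h(order_id) % shards_per_group  # spread orders within the group
    return group * shards_per_group + member
```

A tenant-scoped query then fans out to at most `shards_per_group` shards instead of all of them.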

Cross-Shard Query Patterns

Scatter-Gather

Query all shards in parallel, merge results:

```python
import asyncio

async def search_all_shards(shards: list, query: str, params: dict) -> list:
    # Fan the same query out to every shard concurrently
    # (assumes async shard connections)
    tasks = [shard.execute(query, params) for shard in shards]
    results = await asyncio.gather(*tasks)
    merged = []
    for result_set in results:
        merged.extend(result_set)
    # Merge and re-sort across shards, keeping only the top 100 rows
    return sorted(merged, key=lambda r: r["created_at"], reverse=True)[:100]
```

Cost: N round-trips (one per shard). Acceptable for dashboards, unacceptable for real-time APIs.

Global Tables

Keep small, frequently-joined reference data replicated across all shards:

```
Global tables (replicated to every shard):
├── countries
├── currencies
├── feature_flags
└── pricing_tiers

Sharded tables (distributed by tenant_id):
├── users
├── orders
├── invoices
└── activity_logs
```
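Keeping global tables in sync can be as simple as fanning each reference-table write out to every shard. A sketch, assuming shard handles expose an `execute` method; the key design point is that a partial failure leaves replicas inconsistent, so failures must be surfaced for retry or alerting:

```python
def update_global_table(shards: list, query: str, params: tuple) -> list:
    """Apply a reference-table write to every shard; return any failures."""
    failures = []
    for shard in shards:
        try:
            shard.execute(query, params)
        except Exception as exc:
            # A partial failure leaves replicas inconsistent -- collect it
            # so the write can be retried or alerted on
            failures.append((shard, exc))
    return failures
```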

Rebalancing

When shards become unevenly loaded, you need to move data between them.

Live Rebalancing Process

1. Create new shard(s)
2. Begin dual-writing to old and new shard
3. Backfill historical data to new shard
4. Verify data consistency (checksums)
5. Switch reads to new shard
6. Stop writing to old shard
7. Clean up old shard data
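Steps 2-6 above can be sketched as a phase-aware router that the application routes through during migration. The class and phase names are illustrative:

```python
class DualWriteRouter:
    """Routes reads/writes while migrating data from old_shard to new_shard."""

    def __init__(self, old_shard, new_shard):
        self.old, self.new = old_shard, new_shard
        self.phase = "dual_write"  # -> "cutover" -> "done"

    def write(self, query, params):
        if self.phase == "done":
            self.new.execute(query, params)  # old shard decommissioned
        else:
            # Dual-write keeps both copies current while backfill runs
            self.old.execute(query, params)
            self.new.execute(query, params)

    def read(self, query, params):
        # Reads stay on the old shard until consistency is verified
        target = self.old if self.phase == "dual_write" else self.new
        return target.execute(query, params)
```

Advancing `phase` to "cutover" only after checksums pass (step 4) is what makes the migration safe to roll back.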

Consistent Hashing

Minimizes data movement when adding/removing shards:

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    def __init__(self, nodes: list, virtual_nodes: int = 150):
        self.ring = {}
        self.sorted_keys = []

        # Each physical node gets many virtual nodes, smoothing distribution
        for node in nodes:
            for i in range(virtual_nodes):
                key = self._hash(f"{node}:{i}")
                self.ring[key] = node
                self.sorted_keys.append(key)

        self.sorted_keys.sort()

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after the key's hash
        h = self._hash(key)
        idx = bisect_right(self.sorted_keys, h) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

# Adding a new node only moves ~1/N of the data
ring = ConsistentHashRing(["shard-1", "shard-2", "shard-3"])
ring.get_node("tenant_123")  # Deterministically maps to one of the three shards
```
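The "~1/N" claim can be checked empirically by comparing shard assignments before and after adding a node, and contrasting with naive modulo hashing, which reshuffles most keys. A sketch using a compact ring built the same way as the class above:

```python
import hashlib
from bisect import bisect_right

def _h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(nodes: list, vnodes: int = 150) -> list:
    # (hash, node) pairs sorted by hash position on the ring
    return sorted((_h(f"{n}:{i}"), n) for n in nodes for i in range(vnodes))

def owner(ring: list, key: str) -> str:
    idx = bisect_right(ring, (_h(key), "")) % len(ring)
    return ring[idx][1]

keys = [f"tenant_{i}" for i in range(10_000)]
before = build_ring(["shard-1", "shard-2", "shard-3"])
after = build_ring(["shard-1", "shard-2", "shard-3", "shard-4"])

ring_moved = sum(owner(before, k) != owner(after, k) for k in keys)
naive_moved = sum(_h(k) % 3 != _h(k) % 4 for k in keys)
print(f"consistent hashing moved {ring_moved / len(keys):.0%}, "
      f"modulo hashing moved {naive_moved / len(keys):.0%}")
```

With consistent hashing only the keys claimed by the new node move (around a quarter here); with modulo hashing roughly three quarters of keys change shard.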

Alternatives to Application-Level Sharding

| Solution | How It Works | When to Use |
|---|---|---|
| PostgreSQL table partitioning | Split table by range/list/hash within one server | <1TB, need partition pruning |
| Citus (PostgreSQL) | Distributed PostgreSQL with transparent sharding | Want SQL compatibility |
| Vitess (MySQL) | Sharding middleware for MySQL | YouTube-scale MySQL workloads |
| CockroachDB | Distributed SQL with auto-sharding | Multi-region, strong consistency |
| MongoDB | Native sharding with sh.shardCollection() | Document workloads |

Failure Modes and Recovery

| Failure | Impact | Recovery Strategy |
|---|---|---|
| Single shard down | Partial outage (1/N of data unavailable) | Replica promotion, automatic failover |
| Routing layer down | Full outage (no requests reach any shard) | Redundant routing, health checks |
| Shard key regret | Expensive re-sharding required | Dual-write migration to new key |
| Unbalanced shards | Hot shard degrades, others idle | Rebalance with consistent hashing |
| Cross-shard deadlock | Queries hang, cascading timeouts | Timeout policies, circuit breakers |
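The circuit-breaker mitigation for cross-shard hangs can be sketched as a per-shard breaker that stops routing queries to a failing shard until a cooldown elapses. The thresholds here are illustrative:

```python
import time

class ShardBreaker:
    """Rejects queries to a shard after repeated consecutive failures."""

    def __init__(self, threshold: int = 5, cooldown: float = 30.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures = 0
        self.opened_at = None  # None = closed (healthy)

    def allow(self) -> bool:
        # An open breaker rejects queries until the cooldown elapses,
        # then lets one probe through (half-open)
        if self.opened_at is None:
            return True
        return time.monotonic() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        self.failures, self.opened_at = 0, None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()
```

Pairing this with a hard query timeout keeps one slow shard from exhausting the application's connection pool.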

Shard Monitoring Dashboard Metrics

Track these per shard to detect problems early:

  • QPS per shard — Should be roughly equal; more than 2x difference signals imbalance
  • P99 latency per shard — Rising latency on one shard indicates resource pressure
  • Disk usage per shard — Uneven growth means your shard key has skewed distribution
  • Connection count per shard — Connection pooling should distribute evenly
  • Replication lag — If using read replicas per shard, lag over 1s needs investigation
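The 2x-imbalance heuristic above is straightforward to automate against per-shard QPS metrics. A sketch (the floor guard keeps a near-idle shard from flagging every other shard):

```python
def find_hot_shards(qps_by_shard: dict, ratio: float = 2.0) -> list:
    """Flag shards whose QPS exceeds `ratio` times the quietest shard's QPS."""
    if len(qps_by_shard) < 2:
        return []
    floor = max(min(qps_by_shard.values()), 1.0)
    return [s for s, qps in qps_by_shard.items() if qps > ratio * floor]
```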

Sharding Checklist

  • Exhausted single-node optimizations first (indexes → replicas → caching → vertical)
  • Shard key chosen based on access patterns (not just even distribution)
  • Strategy selected (hash, range, geo, directory) based on query patterns
  • Cross-shard query patterns identified and cost accepted
  • Global/reference tables identified for replication
  • Monitoring per-shard metrics (size, QPS, latency, connection count)
  • Rebalancing strategy documented (consistent hashing or directory)
  • Backup/restore tested per-shard
  • Application routing layer tested for failover
  • Data consistency verification automated (checksums, row counts)
  • Rollback plan documented in case of shard key regret

:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For database scaling consulting, visit garnetgrid.com.
:::

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
