
Distributed Database Architecture

Design distributed database systems. Covers CAP theorem, consensus protocols, sharding strategies, replication topologies, conflict resolution, and choosing between consistency and availability.

Distributed databases trade simplicity for scalability. A single PostgreSQL instance handles millions of rows effortlessly, but at some point you need more throughput, more storage, or more availability than one server provides. That’s when you distribute data across multiple nodes — and everything gets harder.


CAP Theorem in Practice

| Property | What It Means | Examples |
|---|---|---|
| Consistency | All reads return the most recent write | PostgreSQL, CockroachDB, Spanner |
| Availability | Every request gets a response | Cassandra, DynamoDB |
| Partition Tolerance | System works despite network failures | All distributed systems (mandatory) |

You can only pick two, and since network partitions are unavoidable in any distributed system, the real choice is between consistency and availability during a partition (CP or AP):

| Choice | Trade-off | Use Case |
|---|---|---|
| CP (Consistent + Partition-tolerant) | Unavailable during a partition | Financial transactions, inventory |
| AP (Available + Partition-tolerant) | May return stale data during a partition | Social feeds, analytics, caching |

Replication Topologies

| Topology | Consistency | Availability | Latency | Use Case |
|---|---|---|---|---|
| Single-leader | Strong | Medium | Low (local reads) | Most applications |
| Multi-leader | Eventual | High | Low (local writes) | Multi-region apps |
| Leaderless | Tunable (quorum) | Highest | Varies | High-availability systems |
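The "tunable (quorum)" entry for leaderless replication can be made concrete: with N replicas, a write waits for W acknowledgements and a read queries R replicas; if R + W > N, the read and write sets overlap, so at least one replica returns the latest acknowledged write. A minimal sketch of this Dynamo-style behavior, with illustrative names (`Replica`, `quorum_write`) that are not from any specific database:

```python
# Dynamo-style quorum sketch: N replicas, write quorum W, read quorum R.
# With R + W > N, read and write quorums overlap, so a read always
# sees at least one copy of the latest acknowledged write.

class Replica:
    def __init__(self):
        self.value = None
        self.version = 0

    def write(self, value, version):
        # Keep only the newest version (last-writer-wins per replica).
        if version > self.version:
            self.value, self.version = value, version

    def read(self):
        return self.value, self.version


def quorum_write(replicas, value, version, w):
    """Succeed once W replicas have acknowledged the write."""
    acks = 0
    for replica in replicas:
        replica.write(value, version)
        acks += 1
        if acks >= w:
            return True
    return False


def quorum_read(replicas, r):
    """Query R replicas and keep the highest-versioned response."""
    responses = [replica.read() for replica in replicas[:r]]
    return max(responses, key=lambda resp: resp[1])


# N=3, W=2, R=2: R + W > N, so the quorums overlap.
replicas = [Replica() for _ in range(3)]
quorum_write(replicas, "v1", version=1, w=2)
value, _ = quorum_read(replicas, r=2)
print(value)  # v1
```

Lowering W speeds up writes but shrinks the overlap guarantee; with R + W ≤ N a read may miss the latest write entirely, which is exactly the "tunable" trade-off the table describes.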

Sharding Strategies

| Strategy | How | Pros | Cons |
|---|---|---|---|
| Range | Shard by key range (A-M, N-Z) | Range queries efficient | Hot spots (uneven distribution) |
| Hash | Hash(key) → shard | Even distribution | Range queries cross all shards |
| Directory | Lookup table maps key → shard | Flexible | Lookup table is single point of failure |
| Geographic | Shard by region | Data locality, compliance | Cross-region queries expensive |
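A naive `hash(key) % num_shards` scheme reshuffles almost every key when a shard is added, so production hash sharding usually uses consistent hashing: shards occupy points on a hash ring, each key maps to the next shard clockwise, and adding a shard moves only a fraction of keys. A minimal sketch (the shard names, `vnodes` count, and `HashRing` class are illustrative):

```python
# Consistent-hashing sketch for hash sharding. Each shard is placed at
# many points ("virtual nodes") on a ring of hash values; a key belongs
# to the first shard at or after its own hash, wrapping around.
import bisect
import hashlib


def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, shards, vnodes=64):
        # Virtual nodes smooth out the key distribution across shards.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    def shard_for(self, key: str) -> str:
        # First ring position at or after the key's hash, with wraparound.
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["shard-a", "shard-b", "shard-c"])
print(ring.shard_for("user:42"))
```

The cons column still applies: range queries over `user:*` keys fan out to every shard, because adjacent keys hash to unrelated ring positions.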

Database Selection Guide

| Database | Type | Consistency | Best For |
|---|---|---|---|
| PostgreSQL | Relational (single-node) | Strong | Most workloads < 1TB |
| CockroachDB | Distributed SQL | Strong (serializable) | Global SQL, strong consistency |
| Vitess | Sharded MySQL | Strong (per-shard) | Scaling MySQL horizontally |
| Cassandra | Wide-column | Tunable | High-write, time-series |
| DynamoDB | Key-value / document | Eventual (or strong per-item) | Serverless, predictable performance |
| MongoDB | Document | Tunable (causal) | Document workloads, flexible schema |
| ScyllaDB | Wide-column (C++) | Tunable | Cassandra-compatible, lower latency |

Anti-Patterns

| Anti-Pattern | Problem | Fix |
|---|---|---|
| Shard before you need to | Complexity without benefit | Scale up first (bigger instance), shard only when necessary |
| Cross-shard transactions | Slow, complex, failure-prone | Design schema to keep related data on same shard |
| Distributed joins | Scatter-gather across all shards | Denormalize, materialize views |
| Ignoring data locality | User data spread across distant shards | Shard by tenant/user, geo-shard if multi-region |
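The fix for cross-shard transactions and data locality is usually the same schema decision: use the tenant (or user) ID as the shard key for every table that belongs to that tenant. A minimal sketch, assuming a simple modulo router (the shard count and function name are illustrative):

```python
# Tenant-keyed shard routing sketch: orders, invoices, and users for a
# tenant all route by tenant_id, so any transaction scoped to one
# tenant touches exactly one shard.
NUM_SHARDS = 8


def shard_for_tenant(tenant_id: int) -> int:
    """Route every table's rows for a tenant to the same shard."""
    return tenant_id % NUM_SHARDS


# All of tenant 1042's data lands on one shard, so a transaction
# touching its orders and invoices never crosses shard boundaries.
orders_shard = shard_for_tenant(1042)
invoices_shard = shard_for_tenant(1042)
assert orders_shard == invoices_shard
```

The cost is that queries spanning tenants (analytics, global search) become scatter-gather, which is why the table above recommends denormalizing or materializing views for those paths.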

Checklist

  • Data volume and growth assessed (do you actually need distribution?)
  • CAP trade-off chosen: CP or AP based on business requirements
  • Replication topology selected and tested for failure scenarios
  • Sharding key chosen to avoid hot spots and cross-shard queries
  • Conflict resolution strategy for multi-leader/leaderless
  • Backup and recovery tested for distributed setup
  • Monitoring: replication lag, shard balance, query routing
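The conflict-resolution item in the checklist deserves a concrete shape. Two common strategies for multi-leader and leaderless setups are last-write-wins (simple, but silently discards the losing write) and CRDT-style merges (no write lost, but only for mergeable data types). A minimal sketch, assuming each write carries a timestamp and node ID; the function names are illustrative, not from any specific database:

```python
# Two conflict-resolution strategies for concurrent writes that
# arrive at different leaders/replicas.

def last_write_wins(a, b):
    """Keep the write with the later timestamp, breaking ties by
    node id. Deterministic everywhere, but the loser is discarded."""
    return max(a, b, key=lambda w: (w["ts"], w["node"]))


def merge_sets(a, b):
    """CRDT-style merge for a grow-only set: take the union, so
    neither side's additions are lost (deletes need tombstones)."""
    return a | b


w1 = {"ts": 10, "node": "us-east", "value": "alice@old.example"}
w2 = {"ts": 12, "node": "eu-west", "value": "alice@new.example"}
print(last_write_wins(w1, w2)["value"])  # alice@new.example

print(merge_sets({"tag:a"}, {"tag:b"}))
```

Last-write-wins depends on clocks being roughly synchronized across nodes; with significant clock skew it can keep the logically older write, which is why version vectors or CRDTs are preferred when losing writes is unacceptable.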

:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For database architecture consulting, visit garnetgrid.com.
:::

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
