Database Replication Topologies
Design and operate database replication for high availability, disaster recovery, and read scaling. Covers single-leader, multi-leader, and leaderless replication, replication lag, conflict resolution, and the patterns that keep data consistent across replicas.
Database replication copies data from one server to others to achieve high availability, disaster recovery, and read scalability. But replication introduces complexity: lag, conflicts, split-brain scenarios, and consistency tradeoffs. Understanding replication topologies is essential for designing reliable data systems.
Replication Topologies
Single-Leader (Primary-Replica):
┌──────────┐ Replication ┌──────────┐
│ Primary │ ────────────────→ │ Replica 1│ (read-only)
│ (writes) │ ────────────────→ │ Replica 2│ (read-only)
└──────────┘ └──────────┘
Writes: Primary only
Reads: Primary + Replicas
Failover: Promote a replica to primary
Best for: Read-heavy workloads, simple architecture
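In a single-leader setup the routing decision usually lives in the application layer: writes go to the primary, reads are spread across replicas. A minimal sketch, assuming connection objects that expose an `execute` method (not any specific driver API):

```python
import random

class SingleLeaderRouter:
    """Sketch of statement-based routing for a primary/replica topology."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def execute(self, query: str):
        # Only the primary accepts writes; route by statement type.
        if query.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
            return self.primary.execute(query)
        # Reads are load-balanced across the read-only replicas.
        return random.choice(self.replicas).execute(query)
```

Note this naive router ignores replication lag; the read-your-writes pattern below addresses that gap.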
Multi-Leader (Active-Active):
┌──────────┐ ←→ ┌──────────┐
│ Leader 1 │ │ Leader 2 │
│ (US-East)│ ←──────→ │ (EU-West)│
└──────────┘ └──────────┘
Writes: Any leader
Conflict: Same row modified on both leaders simultaneously
Resolution: Last-write-wins, custom merge, CRDT
Best for: Multi-region, offline-capable apps
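The resolution strategies above can be sketched in a few lines. Last-write-wins keeps the version with the newer timestamp and silently drops the loser, while a grow-only counter CRDT merges per-leader counts so concurrent increments on both leaders are never lost. The names and data shapes here are illustrative:

```python
def last_write_wins(local, remote):
    """Each version is a (value, timestamp) pair; keep the newer one.
    Note: the losing write is silently discarded."""
    return local if local[1] >= remote[1] else remote

class GCounter:
    """Grow-only counter CRDT: each leader increments only its own slot,
    and merge takes the element-wise max, so merging never conflicts."""

    def __init__(self):
        self.counts = {}

    def increment(self, leader_id, n=1):
        self.counts[leader_id] = self.counts.get(leader_id, 0) + n

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # so leaders can exchange state in any order and converge.
        for leader_id, count in other.counts.items():
            self.counts[leader_id] = max(self.counts.get(leader_id, 0), count)

    def value(self):
        return sum(self.counts.values())
```

With last-write-wins, two concurrent updates lose one of the writes; with the CRDT, both leaders' increments survive the merge.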
Leaderless (Quorum):
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Node 1 │ │ Node 2 │ │ Node 3 │
└──────────┘ └──────────┘ └──────────┘
Write: To W of N nodes (e.g., 2 of 3)
Read: From R of N nodes (e.g., 2 of 3)
Rule: W + R > N guarantees the read and write quorums overlap in at least one node, so every read contacts a node holding the latest acknowledged write
Best for: High availability, partition tolerance
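The W + R > N rule can be demonstrated with a toy in-memory store. This is a sketch only: real leaderless systems vary which nodes acknowledge each request and track per-key version vectors rather than a single global version counter.

```python
class QuorumStore:
    """Toy leaderless store: each node is a dict of key -> (value, version).
    With W + R > N the write and read sets must share at least one node."""

    def __init__(self, n=3, w=2, r=2):
        assert w + r > n, "quorum overlap requires W + R > N"
        self.nodes = [{} for _ in range(n)]
        self.w, self.r = w, r
        self.version = 0

    def write(self, key, value):
        self.version += 1
        # Illustration only: always write to the first W nodes.
        for node in self.nodes[:self.w]:
            node[key] = (value, self.version)

    def read(self, key):
        # Contact R nodes; the overlap guarantee means at least one of
        # them holds the latest version, so return the max-version reply.
        replies = [node[key] for node in self.nodes[-self.r:] if key in node]
        return max(replies, key=lambda pair: pair[1])[0]
```

Even though the read set (the last R nodes) is deliberately different from the write set (the first W nodes), the overlapping node always surfaces the newest value.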
Replication Lag
-- Monitoring replication lag in PostgreSQL.
-- Note: pg_stat_replication is populated on the primary, while
-- pg_last_xact_replay_timestamp() is meaningful only on a standby,
-- so these are two separate queries.

-- On the primary: per-replica byte lag
SELECT
    client_addr,
    state,
    sent_lsn,
    write_lsn,
    flush_lsn,
    replay_lsn,
    pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;

-- On a replica: seconds since the last replayed transaction
SELECT EXTRACT(EPOCH FROM (NOW() - pg_last_xact_replay_timestamp())) AS lag_seconds;
# Read-your-writes consistency pattern
import time

class ReadYourWritesRouter:
    """Ensure a user reads their own recent writes by routing to the
    primary until replicas have caught up past the user's last write."""

    def route_read(self, user_id: str, query: str):
        last_write = self.get_last_write_time(user_id)  # epoch seconds, or None
        replica_lag = self.get_replica_lag()            # seconds behind primary
        if last_write and replica_lag > 0:
            time_since_write = time.time() - last_write
            if time_since_write < replica_lag + 1:  # +1s safety margin
                # Recent write may not have replicated yet: read from primary
                return self.primary.execute(query)
        # Replicas have caught up past this user's last write
        return self.replica.execute(query)
Failover Strategies
Automatic Failover (Managed Services):
AWS RDS: Automatic failover in ~60-120 seconds
Azure SQL: Automatic failover groups with DNS redirect
Cloud SQL: Regional HA with automatic failover
Manual Failover (Self-Managed):
1. Detect primary failure (health checks, heartbeat)
2. Verify primary is truly down (avoid split-brain)
3. Select promotion candidate (most up-to-date replica)
4. Promote replica to primary
5. Redirect application connections
6. Reconfigure remaining replicas to follow new primary
Risk: Data loss if replication is asynchronous and the primary fails
before replicas have caught up.
Mitigation: Synchronous replication (zero data loss, but higher write
latency, and writes stall if the synchronous replica is unavailable)
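Step 3 of the manual procedure, selecting the most up-to-date replica, amounts to picking the healthy replica with the highest replayed WAL position. A sketch, where the dict shape and the integer `replay_lsn` stand-in are illustrative:

```python
def pick_promotion_candidate(replicas):
    """Choose the replica that has replayed the most WAL, minimizing
    data loss on promotion. Each replica is a dict with a numeric
    'replay_lsn' and an optional 'healthy' flag (assumed shape)."""
    healthy = [r for r in replicas if r.get("healthy", True)]
    if not healthy:
        raise RuntimeError("no healthy replica available for promotion")
    return max(healthy, key=lambda r: r["replay_lsn"])
```

In a real promotion the candidate must also be fenced from the old primary first, so a recovered old primary cannot accept writes and cause split-brain.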
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Ignore replication lag | Stale reads, user confusion | Monitor lag, route critical reads to primary |
| No failover testing | First failover is during an outage | Regular failover drills |
| Multi-leader without conflict resolution | Data corruption, lost writes | Explicit conflict resolution strategy |
| Sync replication everywhere | High latency, reduced availability | Sync for critical data, async for the rest |
| No monitoring of replica health | Silent replica failure | Alert on lag, connection status, query errors |
Replication is the foundation of high availability, but it introduces complexity that must be actively managed. Choose the simplest topology that meets your requirements, and always monitor replication health.