Database Connection Pooling
Optimize database connection management with connection pooling strategies. Covers pool sizing, PgBouncer configuration, connection lifecycle, health checks, pool monitoring, and the patterns that prevent connection exhaustion under load.
Database connections are expensive. Each PostgreSQL connection consumes roughly 10MB of RAM. Opening a new connection takes 50-100ms of handshake time. Under load, applications that create a new connection per request will exhaust the database’s connection limit, causing cascading failures. Connection pooling solves this by reusing a fixed set of connections.
Why Pooling Matters
Without pooling:

- 100 app instances × 10 concurrent requests = 1,000 connections (each request holds its connection for the life of the request)
- PostgreSQL default max_connections = 100
- Result: "FATAL: sorry, too many clients already"
- Each new connection costs a TCP handshake + TLS + auth = 50-100ms of overhead

With pooling:

- 100 app instances × 1 pool each (capped at 10, but typically only ~1 connection busy per instance, since a connection is held only while a query runs) ≈ 100 active connections
- Each request borrows a connection and returns it as soon as the query completes
- Connection reuse: effectively zero setup overhead (the connection is already established)
- Same throughput, roughly 10x fewer connections
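To make the borrow/return pattern concrete, here is a minimal sketch using psycopg2's `SimpleConnectionPool`; the DSN, table, and function names are illustrative, not part of the original example.

```python
# Minimal sketch: per-request connections vs. a shared pool (psycopg2).
# DSN, table name, and function names are placeholders.
import psycopg2
from psycopg2.pool import SimpleConnectionPool

DSN = "postgresql://user:pass@db:5432/mydb"

# Anti-pattern: a brand-new connection (TCP + TLS + auth) for every request.
def handle_request_unpooled(user_id):
    conn = psycopg2.connect(DSN)          # pays the 50-100ms handshake every time
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT name FROM users WHERE id = %s", (user_id,))
            return cur.fetchone()
    finally:
        conn.close()

# Pooled: the handshake cost is paid once; requests borrow and return.
pool = SimpleConnectionPool(minconn=2, maxconn=10, dsn=DSN)

def handle_request_pooled(user_id):
    conn = pool.getconn()                 # borrow an already-open connection
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT name FROM users WHERE id = %s", (user_id,))
            return cur.fetchone()
    finally:
        pool.putconn(conn)                # return it so other requests can reuse it
```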
Pool Configuration
```python
# Application-level connection pool (SQLAlchemy)
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@db:5432/mydb",
    # Pool size: maximum sustained concurrent queries
    pool_size=10,
    # Max overflow: extra connections allowed during bursts
    max_overflow=5,
    # Total max connections: pool_size + max_overflow = 15
    # Pool timeout: how long to wait for an available connection
    pool_timeout=30,       # seconds; raise an error after 30s of waiting
    # Recycle: close and reopen connections after this time
    pool_recycle=1800,     # 30 minutes; prevents stale connections
    # Pre-ping: verify the connection is alive before using it
    pool_pre_ping=True,    # small overhead, prevents "connection reset" errors
)
```
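A usage sketch for the engine configured above: `engine.connect()` checks a connection out of the pool, and leaving the `with` block returns it rather than closing it. The query and function name are placeholders.

```python
from sqlalchemy import text

def get_user(user_id):
    # Borrow a connection from the pool; it is returned when the block exits.
    with engine.connect() as conn:
        row = conn.execute(
            text("SELECT name FROM users WHERE id = :id"),  # placeholder query
            {"id": user_id},
        ).first()
    return row

# If all 10 pooled connections are checked out, up to 5 overflow connections are
# opened; beyond that, callers wait up to pool_timeout (30s) and then
# sqlalchemy.exc.TimeoutError is raised.
```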
PgBouncer
```ini
; pgbouncer.ini — external connection pooler for PostgreSQL

[databases]
mydb = host=db-primary port=5432 dbname=mydb

[pgbouncer]
; Where PgBouncer accepts client connections (6432 is the conventional port)
listen_addr = 0.0.0.0
listen_port = 6432

; Pool mode:
;   session:     connection held for the entire client session (least efficient)
;   transaction: connection held for a single transaction (recommended)
;   statement:   connection held for a single statement (most efficient, most limitations)
pool_mode = transaction

; Pool sizing
default_pool_size = 25      ; connections per user/database pair
min_pool_size = 5           ; keep at least 5 connections warm
reserve_pool_size = 5       ; extra connections for burst traffic
reserve_pool_timeout = 3    ; seconds before using the reserve pool

; Connection limits
max_client_conn = 1000      ; max client connections to PgBouncer
max_db_connections = 50     ; max connections to PostgreSQL

; Timeouts
server_idle_timeout = 600   ; close idle server connections after 10 min
client_idle_timeout = 0     ; never close idle client connections
query_timeout = 30          ; kill queries running longer than 30s

; Health checks
server_check_query = SELECT 1
server_check_delay = 30     ; check every 30 seconds
```
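With PgBouncer in front, the application connects to the pooler rather than to PostgreSQL directly. Below is a sketch assuming PgBouncer listens on port 6432 as configured above (host names are placeholders). Note that in transaction pooling mode, session-level state such as `SET`, `LISTEN/NOTIFY`, and session advisory locks does not reliably carry across transactions.

```python
# Sketch: point the application pool at PgBouncer instead of PostgreSQL.
# "pgbouncer-host" is a placeholder; 6432 is PgBouncer's conventional port.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@pgbouncer-host:6432/mydb",
    # PgBouncer already multiplexes server connections, so the application-side
    # pool can stay small; keep pool_pre_ping to detect a restarted pooler.
    pool_size=5,
    max_overflow=5,
    pool_pre_ping=True,
)
```

Some deployments instead disable the application-side pool (for example with SQLAlchemy's `NullPool`) and let PgBouncer do all of the pooling; either way, the server-side connection count stays capped by `max_db_connections`.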
Pool Sizing Formula
Optimal pool size ≈ (core_count * 2) + effective_spindle_count

Examples:

- 4-core CPU, SSD: (4 * 2) + 1 = 9 connections
- 8-core CPU, SSD: (8 * 2) + 1 = 17 connections
- 16-core CPU, SSD: (16 * 2) + 1 = 33 connections

Common mistake: "We have 1,000 concurrent users, so we need 1,000 connections." In reality, most requests are not actively querying the database at any given moment, so a pool of 20-50 connections handles 1,000+ concurrent users.

Rule of thumb:

- Start with (2 * CPU cores) + 1
- Monitor pool utilization, wait time, and overflow count (see the sketch after this list)
- Adjust based on the actual workload
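A monitoring sketch for the rule of thumb above, using the counters exposed by SQLAlchemy's default `QueuePool`; the interpretation notes are illustrative assumptions, not fixed thresholds.

```python
import os

# Starting point from the rule of thumb: (2 * CPU cores) + 1.
pool_size = (os.cpu_count() or 1) * 2 + 1

# QueuePool exposes counters that can be exported to a metrics system.
def pool_metrics(engine):
    pool = engine.pool
    return {
        "size": pool.size(),               # configured pool_size
        "checked_out": pool.checkedout(),  # connections currently in use
        "checked_in": pool.checkedin(),    # idle connections sitting in the pool
        "overflow": pool.overflow(),       # connections beyond pool_size
                                           # (negative until the pool has filled)
    }

# Rough interpretation (illustrative):
# - checked_out consistently near size + max_overflow -> pool is too small
# - overflow never above 0 and checked_out stays low  -> pool may be oversized
```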
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| No connection pooling | Connection exhaustion under load | Use application pool or PgBouncer |
| Pool too large | Wastes database RAM, lock contention | Size based on CPU cores, not user count |
| Pool too small | Requests wait for connections, latency | Monitor pool wait time, increase if > 0 |
| No pool_pre_ping | "Connection reset" errors after idle | Enable pre-ping or health checks |
| No pool_recycle | Stale connections cause errors | Recycle every 30-60 minutes |
| Hold connections during I/O | Pool starved by non-DB work | Return connections right after the query (see the sketch after this table) |
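The last row is worth a concrete sketch: keep the query inside a short connection scope and do slow non-database work (here a hypothetical HTTP call with `requests`) only after the connection has been returned. The URL, query, and function names are illustrative, and `engine` is the pool configured earlier.

```python
import requests
from sqlalchemy import text

# Anti-pattern: the pooled connection stays checked out during the slow HTTP
# call, starving other requests.
def enrich_user_bad(user_id):
    with engine.connect() as conn:
        row = conn.execute(text("SELECT email FROM users WHERE id = :id"),
                           {"id": user_id}).first()
        profile = requests.get(f"https://api.example.com/profiles/{row.email}")  # hypothetical API
        return profile.json()

# Fix: return the connection as soon as the query is done, then do the slow work.
def enrich_user_good(user_id):
    with engine.connect() as conn:
        row = conn.execute(text("SELECT email FROM users WHERE id = :id"),
                           {"id": user_id}).first()
    profile = requests.get(f"https://api.example.com/profiles/{row.email}")      # hypothetical API
    return profile.json()
```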
Connection pooling is the easiest performance win for database-backed applications. A properly sized pool handles far more traffic than per-request connections, with better latency and lower load on the database.