# Connection Pooling and Resource Management

Design connection pools that maximize throughput without exhausting database connections, file handles, or memory. Covers pool sizing formulas, health checks, leak detection, and the configuration that prevents your connection pool from becoming your bottleneck.
Every external resource your application connects to — databases, caches, message queues, HTTP APIs — requires a connection. Creating connections is expensive (TCP handshake, TLS negotiation, authentication). Destroying and recreating connections per request wastes time and resources. Connection pooling solves this by maintaining a set of pre-established connections that requests borrow and return.
## How Connection Pools Work

```
Application Thread Pool

Thread 1 ─── borrow ──▶ ┌──────────────┐
Thread 2 ─── borrow ──▶ │  Connection  │
Thread 3 ─── waiting    │     Pool     │ ──▶ Database
Thread 4 ─── borrow ──▶ │  (10 conns)  │
Thread 5 ─── waiting    └──────────────┘
```
### Pool Lifecycle
1. Application starts → pool creates `minIdle` connections
2. Request arrives → pool provides an idle connection (or creates one if under `maxSize`)
3. Request completes → connection returned to pool (not closed)
4. Pool at max + request arrives → request waits (up to `connectionTimeout`)
5. Connection idle too long → pool closes it (down to `minIdle`)
6. Application shuts down → pool closes all connections
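The lifecycle above can be sketched as a minimal pool. This is an illustrative toy (no thread-safe creation accounting, no idle reaping), not production code; `create_conn` stands in for a real driver's `connect()`:

```python
import queue


class SimplePool:
    """Toy pool illustrating the borrow/return lifecycle above."""

    def __init__(self, create_conn, max_size=10, timeout=30.0):
        self._create = create_conn  # factory, e.g. a real driver's connect()
        self._idle = queue.Queue()  # connections waiting to be borrowed
        self._max = max_size
        self._created = 0           # note: not thread-safe; a real pool locks this
        self._timeout = timeout

    def borrow(self):
        try:
            return self._idle.get_nowait()  # step 2: reuse an idle connection
        except queue.Empty:
            if self._created < self._max:   # step 2: create one if under max_size
                self._created += 1
                return self._create()
            # step 4: pool at max — block until a connection is returned,
            # raising queue.Empty if the timeout expires first
            return self._idle.get(timeout=self._timeout)

    def give_back(self, conn):
        self._idle.put(conn)                # step 3: return to the pool, don't close
```

Real pools layer validation, recycling, and leak detection on top of exactly this borrow/return core.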
## Pool Sizing

### The Formula

The optimal pool size is not “as many as possible.” It is:

```
pool_size = (core_count * 2) + effective_spindle_count
```

For SSDs with no spindle latency, the effective spindle count is about 1, giving `pool_size ≈ core_count * 2 + 1`. An 8-core server needs approximately 17 connections, not 100.
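As a quick sanity check, the formula can be wrapped in a helper (names here are illustrative):

```python
def recommended_pool_size(core_count: int, effective_spindle_count: int = 1) -> int:
    """Guideline above: pool_size = (core_count * 2) + effective_spindle_count.

    effective_spindle_count defaults to 1, the SSD case with no spindle latency.
    """
    return core_count * 2 + effective_spindle_count


print(recommended_pool_size(8))  # 8-core server with an SSD → 17
```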
### Why Smaller Is Better

**Scenario A: 100 connections, 50 active queries**

- Database context-switches between 50 transactions
- Each query takes 100 ms due to contention
- Throughput: 500 queries/second

**Scenario B: 20 connections, 20 active queries**

- Database runs 20 queries with minimal contention
- Each query takes 20 ms
- Throughput: 1,000 queries/second
More connections does not mean more throughput. It means more contention.
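The scenario numbers follow from Little's law (throughput = concurrency / latency), which this small helper makes explicit:

```python
def throughput_qps(active_queries: int, avg_latency_s: float) -> float:
    # Little's law rearranged: throughput = concurrency / average latency
    return active_queries / avg_latency_s


# Scenario A: 50 concurrent queries at 100 ms each
print(throughput_qps(50, 0.100))  # 500 queries/second
# Scenario B: 20 concurrent queries at 20 ms each
print(throughput_qps(20, 0.020))  # 1,000 queries/second
```

Halving concurrency while cutting latency fivefold doubles throughput: the latency improvement from reduced contention dominates.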
## Configuration Parameters

```yaml
# HikariCP (Java) — gold standard for connection pools
dataSource:
  maximumPoolSize: 20            # Max connections
  minimumIdle: 5                 # Min idle connections maintained
  connectionTimeout: 30000       # Max wait for a connection (ms)
  idleTimeout: 600000            # Idle connection lifetime (ms)
  maxLifetime: 1800000           # Max connection lifetime (ms)
  validationTimeout: 5000        # Health check timeout (ms)
  leakDetectionThreshold: 60000  # Log a warning if not returned in 60 s
```
```python
# SQLAlchemy (Python)
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://...",
    pool_size=20,        # Base pool size (total cap = pool_size + max_overflow)
    max_overflow=5,      # Burst connections beyond pool_size
    pool_timeout=30,     # Wait time for a connection (s)
    pool_recycle=1800,   # Recycle connections after 30 min
    pool_pre_ping=True,  # Validate before use
)
```
### Critical Parameters
| Parameter | Too Low | Too High | Recommended |
|---|---|---|---|
| maxPoolSize | Requests queue, timeouts | Database overloaded | cores * 2 + 1 |
| connectionTimeout | Requests fail fast (maybe too fast) | Requests hang indefinitely | 10-30 seconds |
| maxLifetime | Connections recycled too often | Stale connections accumulate | 15-30 minutes |
| idleTimeout | Pool churns constantly | Abandoned connections waste DB slots | 5-10 minutes |
## Health Checks

### Connection Validation

```sql
-- Lightweight validation query
SELECT 1;
```

Configure the pool to validate connections before handing them to application code (`testOnBorrow` / `pool_pre_ping`). This catches connections killed by firewalls, database restarts, or network partitions.
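A test-on-borrow loop might look like the following sketch. The `pool.get_connection()`, `conn.execute()`, and `conn.close()` calls are assumed interfaces, not any specific library's API:

```python
def borrow_validated(pool, max_attempts=3):
    """Borrow a connection, validating it with SELECT 1 before handing it out."""
    for _ in range(max_attempts):
        conn = pool.get_connection()
        try:
            conn.execute("SELECT 1")  # lightweight validation query
            return conn               # healthy: hand it to the caller
        except Exception:
            conn.close()              # dead (firewall, restart): discard and retry
    raise RuntimeError(f"no valid connection after {max_attempts} attempts")
```

Production pools do this internally when validation is enabled; the point is that a dead connection costs one extra round trip, not a user-facing error.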
### Connection Recycling

Connections should not live forever. Firewalls close idle connections silently, databases have connection lifetime limits, and memory leaks accumulate:

- `maxLifetime: 30 minutes` → recycle every connection before the database-side timeout
- Stagger recycling to avoid a mass reconnection storm
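One way to stagger recycling is to subtract random jitter from each connection's deadline so the whole pool never reconnects at once; this helper is illustrative:

```python
import random


def connection_lifetime_s(max_lifetime_s: float = 1800.0,
                          jitter_frac: float = 0.1) -> float:
    """Per-connection lifetime: maxLifetime minus up to 10% random jitter.

    Each connection gets a slightly different deadline, so recycling is
    spread over ~3 minutes instead of hitting the database all at once.
    """
    return max_lifetime_s - random.uniform(0.0, max_lifetime_s * jitter_frac)
```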
## Connection Leak Detection

A connection leak occurs when application code borrows a connection but never returns it:

```python
# LEAK: connection never closed
def get_user(user_id):
    conn = pool.get_connection()
    result = conn.execute("SELECT * FROM users WHERE id = %s", user_id)
    return result  # Connection never returned to pool!


# CORRECT: use a context manager
def get_user(user_id):
    with pool.get_connection() as conn:
        result = conn.execute("SELECT * FROM users WHERE id = %s", user_id)
        return result  # Connection automatically returned on exit
```
### Detection Strategies

1. **Timeout-based**: alert if a connection is not returned within 60 seconds
2. **Stack trace logging**: log the stack trace of the borrow site so the leak can be traced to its source
3. **Pool monitoring**: track in-use vs. available connections over time — a continuously growing "in-use" count indicates a leak
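Strategy 3 can be automated with a simple heuristic over recent samples of the in-use count (a sketch; a real monitor would also account for traffic level):

```python
from collections import deque


class LeakMonitor:
    """Flags a likely leak when the in-use count only ever grows over a window."""

    def __init__(self, window: int = 10):
        self.samples = deque(maxlen=window)

    def record(self, in_use: int) -> None:
        """Call periodically with the pool's current in-use connection count."""
        self.samples.append(in_use)

    def looks_leaky(self) -> bool:
        s = list(self.samples)
        if len(s) < self.samples.maxlen:
            return False  # not enough data yet
        # Monotonically non-decreasing and strictly higher at the end: suspicious.
        return all(b >= a for a, b in zip(s, s[1:])) and s[-1] > s[0]
```

A healthy pool's in-use count rises and falls with load; one that only ratchets upward is bleeding connections.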
## Multi-Resource Pool Management

Applications connect to more than just databases:

```python
# Connection pools for different resources
db_pool = create_pool("postgresql://...", max_size=20)
redis_pool = redis.ConnectionPool(max_connections=50)
http_pool = urllib3.HTTPConnectionPool("api.example.com", maxsize=25)  # placeholder host

# Total connections per application instance:
# 20 (DB) + 50 (Redis) + 25 (HTTP) = 95 file descriptors
# Ensure ulimit -n > 95 + headroom
```
### Remember: File Descriptor Limits

Every connection is a file descriptor, and many systems default the per-process soft limit to 1024:

```bash
# Check the current limit
ulimit -n

# Raise it for the current shell and its children
ulimit -n 65536
```
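From inside a Python process, the same check can be automated with the standard-library `resource` module (Unix-only); the 295-descriptor budget echoes the multi-pool example above:

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
needed = 20 + 50 + 25 + 200  # pools from the example above, plus headroom
if soft < needed and hard >= needed:
    # The soft limit can be raised up to the hard limit without privileges.
    resource.setrlimit(resource.RLIMIT_NOFILE, (needed, hard))
```

Checking this at startup turns a mysterious mid-traffic "too many open files" error into an immediate, obvious failure.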
## Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Pool size = 100 “for safety” | Database contention, lower throughput | Size based on cores, not fear |
| No connection validation | Stale connections cause random errors | Enable pre-ping / testOnBorrow |
| No leak detection | Pool exhaustion over hours | Set leak detection threshold |
| No maxLifetime | Firewall kills connections silently | Recycle every 15-30 minutes |
| Connection per request (no pool) | TCP/TLS overhead on every request | Use a connection pool |
Connection pooling is one of those infrastructure concerns that is invisible when configured correctly and catastrophic when misconfigured. Get the sizing right, enable health checks, and monitor for leaks.