# Connection Pooling and Resource Management

Design connection pools that maximize throughput without exhausting database connections, file handles, or memory. Covers pool sizing formulas, health checks, leak detection, and the configuration that prevents your connection pool from becoming your bottleneck.
Every external resource your application connects to — databases, caches, message queues, HTTP APIs — requires a connection. Creating connections is expensive (TCP handshake, TLS negotiation, authentication). Destroying and recreating connections per request wastes time and resources. Connection pooling solves this by maintaining a set of pre-established connections that requests borrow and return.
## How Connection Pools Work

```
Application Thread Pool

Thread 1 ─── borrow ──▶ ┌──────────────┐
Thread 2 ─── borrow ──▶ │  Connection  │
Thread 3 ─── waiting    │     Pool     │ ──▶ Database
Thread 4 ─── borrow ──▶ │  (10 conns)  │
Thread 5 ─── waiting    └──────────────┘
```
### Pool Lifecycle
1. Application starts → pool creates `minIdle` connections
2. Request arrives → pool provides an idle connection (or creates one if under `maxSize`)
3. Request completes → connection returned to pool (not closed)
4. Pool at max + request arrives → request waits (up to `connectionTimeout`)
5. Connection idle too long → pool closes it (down to `minIdle`)
6. Application shuts down → pool closes all connections
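The lifecycle above can be sketched as a minimal pool. This is an illustrative toy (no thread-safe creation accounting, no idle reaping), not production code; `create_conn` stands in for a real driver's `connect()`:

```python
import queue


class SimplePool:
    """Toy pool illustrating the borrow/return lifecycle above."""

    def __init__(self, create_conn, max_size=10, timeout=30.0):
        self._create = create_conn  # factory, e.g. a real driver's connect()
        self._idle = queue.Queue()  # connections waiting to be borrowed
        self._max = max_size
        self._created = 0           # note: not thread-safe; a real pool locks this
        self._timeout = timeout

    def borrow(self):
        try:
            return self._idle.get_nowait()  # step 2: reuse an idle connection
        except queue.Empty:
            if self._created < self._max:   # step 2: create one if under max_size
                self._created += 1
                return self._create()
            # step 4: pool at max — block until a connection is returned,
            # raising queue.Empty if the timeout expires first
            return self._idle.get(timeout=self._timeout)

    def give_back(self, conn):
        self._idle.put(conn)                # step 3: return to the pool, don't close
```

Real pools layer validation, recycling, and leak detection on top of exactly this borrow/return core.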
## Pool Sizing

### The Formula

The optimal pool size is not “as many as possible.” It is:

```
pool_size = (core_count * 2) + effective_spindle_count
```

For SSDs with no spindle latency, the effective spindle count is about 1, giving `pool_size ≈ core_count * 2 + 1`. An 8-core server needs approximately 17 connections, not 100.
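As a quick sanity check, the formula can be wrapped in a helper (names here are illustrative):

```python
def recommended_pool_size(core_count: int, effective_spindle_count: int = 1) -> int:
    """Guideline above: pool_size = (core_count * 2) + effective_spindle_count.

    effective_spindle_count defaults to 1, the SSD case with no spindle latency.
    """
    return core_count * 2 + effective_spindle_count


print(recommended_pool_size(8))  # 8-core server with an SSD → 17
```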
### Why Smaller Is Better

**Scenario A: 100 connections, 50 active queries**

- Database context-switches between 50 transactions
- Each query takes 100 ms due to contention
- Throughput: 500 queries/second

**Scenario B: 20 connections, 20 active queries**

- Database runs 20 queries with minimal contention
- Each query takes 20 ms
- Throughput: 1,000 queries/second
More connections does not mean more throughput. It means more contention.
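The scenario numbers follow from Little's law (throughput = concurrency / latency), which this small helper makes explicit:

```python
def throughput_qps(active_queries: int, avg_latency_s: float) -> float:
    # Little's law rearranged: throughput = concurrency / average latency
    return active_queries / avg_latency_s


# Scenario A: 50 concurrent queries at 100 ms each
print(throughput_qps(50, 0.100))  # 500 queries/second
# Scenario B: 20 concurrent queries at 20 ms each
print(throughput_qps(20, 0.020))  # 1,000 queries/second
```

Halving concurrency while cutting latency fivefold doubles throughput: the latency improvement from reduced contention dominates.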
## Configuration Parameters

```yaml
# HikariCP (Java) — gold standard for connection pools
dataSource:
  maximumPoolSize: 20            # Max connections
  minimumIdle: 5                 # Min idle connections maintained
  connectionTimeout: 30000       # Max wait for a connection (ms)
  idleTimeout: 600000            # Idle connection lifetime (ms)
  maxLifetime: 1800000           # Max connection lifetime (ms)
  validationTimeout: 5000        # Health check timeout (ms)
  leakDetectionThreshold: 60000  # Log a warning if not returned in 60 s
```
```python
# SQLAlchemy (Python)
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://...",
    pool_size=20,        # Base pool size (total cap = pool_size + max_overflow)
    max_overflow=5,      # Burst connections beyond pool_size
    pool_timeout=30,     # Wait time for a connection (s)
    pool_recycle=1800,   # Recycle connections after 30 min
    pool_pre_ping=True,  # Validate before use
)
```
### Critical Parameters
| Parameter | Too Low | Too High | Recommended |
|---|---|---|---|
| maxPoolSize | Requests queue, timeouts | Database overloaded | cores * 2 + 1 |
| connectionTimeout | Requests fail fast (maybe too fast) | Requests hang indefinitely | 10-30 seconds |
| maxLifetime | Connections recycled too often | Stale connections accumulate | 15-30 minutes |
| idleTimeout | Pool churns constantly | Abandoned connections waste DB slots | 5-10 minutes |
## Health Checks

### Connection Validation

```sql
-- Lightweight validation query
SELECT 1;
```

Configure the pool to validate connections before handing them to application code (`testOnBorrow` / `pool_pre_ping`). This catches connections killed by firewalls, database restarts, or network partitions.
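A test-on-borrow loop might look like the following sketch. The `pool.get_connection()`, `conn.execute()`, and `conn.close()` calls are assumed interfaces, not any specific library's API:

```python
def borrow_validated(pool, max_attempts=3):
    """Borrow a connection, validating it with SELECT 1 before handing it out."""
    for _ in range(max_attempts):
        conn = pool.get_connection()
        try:
            conn.execute("SELECT 1")  # lightweight validation query
            return conn               # healthy: hand it to the caller
        except Exception:
            conn.close()              # dead (firewall, restart): discard and retry
    raise RuntimeError(f"no valid connection after {max_attempts} attempts")
```

Production pools do this internally when validation is enabled; the point is that a dead connection costs one extra round trip, not a user-facing error.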
### Connection Recycling

Connections should not live forever. Firewalls close idle connections silently, databases have connection lifetime limits, and memory leaks accumulate:

- `maxLifetime: 30 minutes` → recycle every connection before the database-side timeout
- Stagger recycling to avoid a mass reconnection storm
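One way to stagger recycling is to subtract random jitter from each connection's deadline so the whole pool never reconnects at once; this helper is illustrative:

```python
import random


def connection_lifetime_s(max_lifetime_s: float = 1800.0,
                          jitter_frac: float = 0.1) -> float:
    """Per-connection lifetime: maxLifetime minus up to 10% random jitter.

    Each connection gets a slightly different deadline, so recycling is
    spread over ~3 minutes instead of hitting the database all at once.
    """
    return max_lifetime_s - random.uniform(0.0, max_lifetime_s * jitter_frac)
```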
## Connection Leak Detection

A connection leak occurs when application code borrows a connection but never returns it:

```python
# LEAK: connection never closed
def get_user(user_id):
    conn = pool.get_connection()
    result = conn.execute("SELECT * FROM users WHERE id = %s", user_id)
    return result  # Connection never returned to pool!


# CORRECT: use a context manager
def get_user(user_id):
    with pool.get_connection() as conn:
        result = conn.execute("SELECT * FROM users WHERE id = %s", user_id)
        return result  # Connection automatically returned on exit
```
### Detection Strategies

1. **Timeout-based**: alert if a connection is not returned within 60 seconds
2. **Stack trace logging**: log the stack trace of the borrow site so the leak can be traced to its source
3. **Pool monitoring**: track in-use vs. available connections over time — a continuously growing "in-use" count indicates a leak
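Strategy 3 can be automated with a simple heuristic over recent samples of the in-use count (a sketch; a real monitor would also account for traffic level):

```python
from collections import deque


class LeakMonitor:
    """Flags a likely leak when the in-use count only ever grows over a window."""

    def __init__(self, window: int = 10):
        self.samples = deque(maxlen=window)

    def record(self, in_use: int) -> None:
        """Call periodically with the pool's current in-use connection count."""
        self.samples.append(in_use)

    def looks_leaky(self) -> bool:
        s = list(self.samples)
        if len(s) < self.samples.maxlen:
            return False  # not enough data yet
        # Monotonically non-decreasing and strictly higher at the end: suspicious.
        return all(b >= a for a, b in zip(s, s[1:])) and s[-1] > s[0]
```

A healthy pool's in-use count rises and falls with load; one that only ratchets upward is bleeding connections.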
## Multi-Resource Pool Management

Applications connect to more than just databases:

```python
# Connection pools for different resources
db_pool = create_pool("postgresql://...", max_size=20)
redis_pool = redis.ConnectionPool(max_connections=50)
http_pool = urllib3.HTTPConnectionPool("api.example.com", maxsize=25)  # placeholder host

# Total connections per application instance:
# 20 (DB) + 50 (Redis) + 25 (HTTP) = 95 file descriptors
# Ensure ulimit -n > 95 + headroom
```
### Remember: File Descriptor Limits

Every connection is a file descriptor, and many systems default the per-process soft limit to 1024:

```bash
# Check the current limit
ulimit -n

# Raise it for the current shell and its children
ulimit -n 65536
```
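From inside a Python process, the same check can be automated with the standard-library `resource` module (Unix-only); the 295-descriptor budget echoes the multi-pool example above:

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
needed = 20 + 50 + 25 + 200  # pools from the example above, plus headroom
if soft < needed and hard >= needed:
    # The soft limit can be raised up to the hard limit without privileges.
    resource.setrlimit(resource.RLIMIT_NOFILE, (needed, hard))
```

Checking this at startup turns a mysterious mid-traffic "too many open files" error into an immediate, obvious failure.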
## Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Pool size = 100 “for safety” | Database contention, lower throughput | Size based on cores, not fear |
| No connection validation | Stale connections cause random errors | Enable pre-ping / testOnBorrow |
| No leak detection | Pool exhaustion over hours | Set leak detection threshold |
| No maxLifetime | Firewall kills connections silently | Recycle every 15-30 minutes |
| Connection per request (no pool) | TCP/TLS overhead on every request | Use a connection pool |
Connection pooling is one of those infrastructure concerns that is invisible when configured correctly and catastrophic when misconfigured. Get the sizing right, enable health checks, and monitor for leaks.