SQL Performance Tuning: Making Queries Fast Without Rewriting Everything
Diagnose and fix slow SQL queries using systematic analysis. Covers EXPLAIN plans, index design, query anti-patterns, N+1 problems, connection pooling, and the performance investigation workflow that finds the root cause instead of guessing.
The most common response to a slow database is to throw hardware at it. More CPU, more memory, read replicas. This works until it does not, and then you are paying 10x for infrastructure to compensate for queries that should take 5ms but take 5 seconds.
90% of database performance problems are caused by 10% of queries. Finding those queries, understanding why they are slow, and fixing them without breaking the application is both an art and a science. This guide covers the science part.
The Performance Investigation Workflow
1. Identify slow queries
└─ Query log analysis, APM traces, top-queries view
2. Understand the query plan
└─ EXPLAIN ANALYZE
3. Identify the bottleneck
└─ Full table scan? Missing index? Too many rows? Lock contention?
4. Fix the bottleneck
└─ Add index, rewrite query, change access pattern
5. Verify the fix
└─ Compare EXPLAIN output before and after
6. Monitor in production
└─ Track query latency after deployment
Reading EXPLAIN Plans
The EXPLAIN command is your primary diagnostic tool. It tells you exactly how the database plans to execute your query.
-- PostgreSQL
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 20;
Key Indicators in the Plan
| Indicator | Good | Bad |
|---|---|---|
| Scan type | Index Scan, Index Only Scan | Seq Scan (on large tables) |
| Rows estimated vs actual | Close match (within 10x) | Wildly different (stale statistics) |
| Buffers | Mostly hits (cached) | Mostly reads (I/O bound) |
| Sort method | In-memory (top-N heapsort) | External merge (disk sort) |
| Join type | Nested Loop (small set), Hash Join (medium) | Nested Loop on large unindexed set |
Common plan nodes and what they mean:
Seq Scan → Full table scan (bad on large tables, fine on small ones)
Index Scan → Using an index to find rows (good)
Index Only Scan → All data from the index itself (best)
Bitmap Index Scan → Using index to build a bitmap, then scan heap (medium)
Hash Join → Building hash table for join (good for large-large joins)
Nested Loop → For each row in outer, scan inner (good for small-large)
Sort → Sorting result set (watch for external/disk sort)
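These node types are easiest to internalize by watching a plan change. A minimal, runnable sketch using SQLite as a stand-in (its EXPLAIN QUERY PLAN reports SCAN and SEARCH ... USING INDEX, the rough analogues of Seq Scan and Index Scan; the table and index names here are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (status, total) VALUES (?, ?)",
    [("pending" if i % 3 == 0 else "shipped", i * 1.5) for i in range(1000)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes a plan node
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM orders WHERE status = 'pending'"
before = plan(query)  # expect a SCAN node: full table scan

conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
after = plan(query)   # expect SEARCH ... USING INDEX idx_orders_status
```

The same before/after comparison is step 5 of the workflow above: capture the plan, apply the fix, capture it again.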
Index Design
The Right Index for the Query
-- Query: Find recent pending orders for a customer
SELECT * FROM orders
WHERE customer_id = 123
AND status = 'pending'
AND created_at > '2024-07-01'
ORDER BY created_at DESC;
-- Best index: compound index matching WHERE + ORDER BY
CREATE INDEX idx_orders_customer_status_created
ON orders (customer_id, status, created_at DESC);
-- Column order matters: equality columns first, the range/sort column last
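Column-order behavior is easy to verify empirically. A SQLite sketch of the same schema (plan node names differ from PostgreSQL, but the rule is identical: a query that skips the leading index column cannot seek the compound index and falls back to a full scan):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, created_at TEXT)""")
conn.executemany(
    "INSERT INTO orders (customer_id, status, created_at) VALUES (?, ?, ?)",
    [(i % 100, "pending", f"2024-07-{(i % 28) + 1:02d}") for i in range(1000)],
)
conn.execute("""CREATE INDEX idx_orders_customer_status_created
                ON orders (customer_id, status, created_at DESC)""")

def plan(sql):
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Leading column present: the compound index is seekable (SEARCH node)
seek = plan("SELECT * FROM orders WHERE customer_id = 5 AND status = 'pending' "
            "AND created_at > '2024-07-01' ORDER BY created_at DESC")

# Leading column missing: full scan despite the index (SCAN node)
scan = plan("SELECT * FROM orders WHERE created_at > '2024-07-01'")
```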
Index Selection Rules
| Scenario | Index Type |
|---|---|
| Equality filter (WHERE x = ?) | B-tree (default) |
| Range filter (WHERE x > ?) | B-tree, range column last in compound index |
| Text search (WHERE text LIKE '%query%') | Full-text index (GIN/GiST), NOT B-tree |
| JSON querying (WHERE data->>'key' = ?) | GIN index on JSON column |
| Geospatial (WHERE ST_Within(...)) | GiST (PostGIS) |
| Low-cardinality filter (WHERE status IN (...)) | Partial index or composite with high-cardinality column |
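For the low-cardinality row, a partial index is worth seeing concretely: only rows matching the index predicate are stored, so the index stays small while still serving the hot filter. A SQLite sketch with an invented schema (PostgreSQL uses the same CREATE INDEX ... WHERE syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)")
# ~10% of rows are 'pending'; the rest are 'done'
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 50, "pending" if i % 10 == 0 else "done") for i in range(1000)],
)

# Partial index: only the pending rows are indexed
conn.execute("""CREATE INDEX idx_orders_pending
                ON orders (customer_id) WHERE status = 'pending'""")

detail = [row[-1] for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM orders WHERE status = 'pending' AND customer_id = 10")]
```

The planner uses the partial index only when the query's WHERE clause implies the index predicate, so the filter must include the indexed status value.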
Over-Indexing vs Under-Indexing
| Too Few Indexes | Too Many Indexes |
|---|---|
| Slow reads, full table scans | Slow writes (every INSERT/UPDATE maintains all indexes) |
| Obvious: queries are slow | Hidden: write latency slowly increases |
| Fix: EXPLAIN → add targeted indexes | Fix: audit unused indexes, drop them |
-- Find unused indexes in PostgreSQL
SELECT
schemaname || '.' || relname AS table_name,
indexrelname AS index,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(i.indexrelid)) AS size
FROM pg_stat_user_indexes i
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0
AND NOT indisunique -- Keep unique constraint indexes
ORDER BY pg_relation_size(i.indexrelid) DESC;
Query Anti-Patterns
The N+1 Problem
# ❌ N+1 queries (1 query for orders + N queries for customers)
orders = db.query("SELECT * FROM orders WHERE status = 'pending'")
for order in orders:
    # This executes once PER ORDER — if 1000 orders, 1000 queries
    customer = db.query("SELECT * FROM customers WHERE id = ?", order.customer_id)
# ✅ Fixed: JOIN or batch fetch
orders = db.query("""
SELECT o.*, c.name as customer_name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
""")
# ✅ Alternative: batch fetch with IN clause
customer_ids = [o.customer_id for o in orders]
customers = db.query("SELECT * FROM customers WHERE id = ANY(?)", customer_ids)
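The difference is easy to measure by counting statements. A runnable sketch with SQLite and a small counting wrapper (the schema, row counts, and helper name q are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(i, f"cust-{i}") for i in range(100)])
conn.executemany("INSERT INTO orders (customer_id, status) VALUES (?, 'pending')",
                 [(i,) for i in range(100)])

queries = 0
def q(sql, params=()):
    global queries
    queries += 1  # count every round trip to the database
    return conn.execute(sql, params).fetchall()

# N+1: one query for orders, then one per order
orders = q("SELECT id, customer_id FROM orders WHERE status = 'pending'")
for _, customer_id in orders:
    q("SELECT name FROM customers WHERE id = ?", (customer_id,))
n_plus_one = queries  # 1 + 100 round trips

# Batch fetch: two queries total, regardless of order count
queries = 0
orders = q("SELECT id, customer_id FROM orders WHERE status = 'pending'")
ids = sorted({cid for _, cid in orders})
placeholders = ",".join("?" * len(ids))
q(f"SELECT id, name FROM customers WHERE id IN ({placeholders})", ids)
batched = queries
```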
Other Anti-Patterns
| Anti-Pattern | Why It Is Slow | Fix |
|---|---|---|
| SELECT * | Fetches all columns including large TEXT/BLOB | Select only needed columns |
| Functions on indexed columns | WHERE UPPER(email) = ? cannot use index on email | Functional index or normalize data |
| OR conditions | WHERE a = 1 OR b = 2 — cannot use single index efficiently | Use UNION or separate queries |
| Correlated subqueries | Subquery runs once per row in outer query | Rewrite as JOIN |
| Missing LIMIT | Returns 100K rows when you need 20 | Always LIMIT unless you need everything |
| OFFSET pagination | OFFSET 100000 scans and discards 100K rows | Use keyset pagination (WHERE id > last_seen_id) |
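The "functions on indexed columns" row is the most common of these in practice. A SQLite sketch with invented names (SQLite calls these expression indexes; PostgreSQL's functional indexes use the same CREATE INDEX ... (UPPER(email)) form):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"User{i}@example.com",) for i in range(500)])
conn.execute("CREATE INDEX idx_users_email ON users (email)")

def plan(sql):
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT id FROM users WHERE UPPER(email) = 'USER7@EXAMPLE.COM'"
before = plan(query)  # the plain index on email cannot serve UPPER(email): scan

# Expression index matching the wrapped column makes the query seekable
conn.execute("CREATE INDEX idx_users_email_upper ON users (UPPER(email))")
after = plan(query)
```

The index expression must match the query expression exactly; the alternative fix is to normalize the data at write time and query the stored column directly.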
Keyset Pagination vs OFFSET
-- ❌ OFFSET: gets slower as you paginate deeper
SELECT * FROM orders ORDER BY created_at DESC OFFSET 10000 LIMIT 20;
-- Scans 10,020 rows to return 20: the database walks and discards every skipped row, so cost grows linearly with page depth.
-- ✅ Keyset: constant speed regardless of page depth
SELECT * FROM orders
WHERE created_at < '2024-07-15T10:30:00Z' -- Last item from previous page
ORDER BY created_at DESC
LIMIT 20;
-- Always scans ~20 rows via index. Same speed at page 1 and page 500.
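A runnable sketch of both page fetches with SQLite (the timestamps are synthetic; in a real schema, add a unique tiebreaker such as (created_at, id) to the keyset so equal timestamps cannot drop rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
# 720 rows with unique, lexicographically sortable timestamps
conn.executemany(
    "INSERT INTO orders (created_at) VALUES (?)",
    [(f"2024-07-{d:02d}T{h:02d}:00:00Z",) for d in range(1, 31) for h in range(24)],
)
conn.execute("CREATE INDEX idx_orders_created ON orders (created_at)")

PAGE = 20

def first_page():
    return conn.execute(
        "SELECT id, created_at FROM orders ORDER BY created_at DESC LIMIT ?",
        (PAGE,)).fetchall()

def next_page(last_created_at):
    # Keyset: seek past the last row seen instead of counting rows with OFFSET
    return conn.execute(
        "SELECT id, created_at FROM orders WHERE created_at < ? "
        "ORDER BY created_at DESC LIMIT ?", (last_created_at, PAGE)).fetchall()

page1 = first_page()
page2 = next_page(page1[-1][1])  # cursor = last timestamp from page 1
```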
Connection Pooling
Every database connection consumes memory (5-10MB on PostgreSQL). Opening connections is slow (TCP handshake + authentication). Without pooling, each web request opens a new connection, uses it for 5ms, and closes it.
Without pooling (100 concurrent requests):
100 connections opened → 100 × 10MB = 1GB RAM on DB server
Each connection: 50ms to establish, 5ms to query, close
Database max_connections hit at scale → connections refused
With pooling (100 concurrent requests, pool size 20):
20 persistent connections shared across requests
Each request: borrow connection (0ms), query (5ms), return
Database uses 200MB instead of 1GB
Scales to thousands of concurrent requests
| Pool Setting | Recommended | Why |
|---|---|---|
| Pool size | 2-5 per CPU core | Too high wastes DB memory, too low causes queueing |
| Max overflow | 2× pool size | Handles burst traffic |
| Connection timeout | 5 seconds | Fail fast rather than queue indefinitely |
| Idle timeout | 300 seconds | Reclaim unused connections |
| Max lifetime | 1800 seconds | Prevent stale connections |
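A minimal fixed-size pool can be sketched in a few lines with a blocking queue; real pools (SQLAlchemy's, HikariCP, PgBouncer) add the overflow, idle, and max-lifetime handling from the table above. This toy version, with invented names, shows only the borrow/return cycle and the fail-fast acquire timeout:

```python
import queue
import sqlite3

class ConnectionPool:
    """Toy fixed-size pool: pre-open N connections, borrow and return them."""

    def __init__(self, size, connect):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(connect())

    def acquire(self, timeout=5.0):
        # Fail fast rather than queue indefinitely; raises queue.Empty on timeout
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(
    size=4,
    connect=lambda: sqlite3.connect(":memory:", check_same_thread=False),
)

conn = pool.acquire()          # borrow: no connection setup cost on the request path
result = conn.execute("SELECT 1").fetchone()[0]
pool.release(conn)             # return the connection for the next request
```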
Implementation Checklist
- Identify your top 10 slowest queries (from slow query log or APM)
- Run EXPLAIN ANALYZE on each slow query — understand the plan before fixing
- Add indexes for the top bottlenecks (compound indexes matching WHERE + ORDER BY)
- Fix N+1 queries: use JOINs or batch fetches instead of per-row queries
- Replace OFFSET pagination with keyset pagination for paginated endpoints
- Select only needed columns: never SELECT * in production code
- Configure connection pooling: pool size = 2-5 × CPU cores
- Set up slow query logging: alert on queries > 1 second
- Audit unused indexes quarterly and drop them (they slow down writes)
- Update table statistics regularly (ANALYZE in PostgreSQL) to keep query plans accurate
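The slow-query-logging item can be prototyped at the application layer before touching database config. A sketch (the threshold and helper names are illustrative; production setups usually rely on the database's own log, e.g. log_min_duration_statement in PostgreSQL):

```python
import logging
import sqlite3
import time

logging.basicConfig(level=logging.WARNING)
SLOW_QUERY_SECONDS = 1.0  # alert threshold from the checklist

slow_queries = []

def timed_query(conn, sql, params=()):
    # Time each statement; record and log anything over the threshold
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed > SLOW_QUERY_SECONDS:
        slow_queries.append((sql, elapsed))
        logging.warning("slow query (%.2fs): %s", elapsed, sql)
    return rows

conn = sqlite3.connect(":memory:")
rows = timed_query(conn, "SELECT 1")  # fast query: nothing is logged
```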