SQL Performance Tuning: Making Queries Fast Without Rewriting Everything
Diagnose and fix slow SQL queries using systematic analysis. Covers EXPLAIN plans, index design, query anti-patterns, N+1 problems, connection pooling, and the performance investigation workflow that finds the root cause instead of guessing.
The most common response to a slow database is to throw hardware at it. More CPU, more memory, read replicas. This works until it does not, and then you are paying 10x for infrastructure to compensate for queries that should take 5ms but take 5 seconds.
90% of database performance problems are caused by 10% of queries. Finding those queries, understanding why they are slow, and fixing them without breaking the application is both an art and a science. This guide covers the science part.
The Performance Investigation Workflow
1. Identify slow queries
└─ Query log analysis, APM traces, top-queries view
2. Understand the query plan
└─ EXPLAIN ANALYZE
3. Identify the bottleneck
└─ Full table scan? Missing index? Too many rows? Lock contention?
4. Fix the bottleneck
└─ Add index, rewrite query, change access pattern
5. Verify the fix
└─ Compare EXPLAIN output before and after
6. Monitor in production
└─ Track query latency after deployment
Reading EXPLAIN Plans
The EXPLAIN command is your primary diagnostic tool. It tells you exactly how the database plans to execute your query.
-- PostgreSQL
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 20;
Key Indicators in the Plan
| Indicator | Good | Bad |
|---|---|---|
| Scan type | Index Scan, Index Only Scan | Seq Scan (on large tables) |
| Rows estimated vs actual | Close match (within 10x) | Wildly different (stale statistics) |
| Buffers | Mostly hits (cached) | Mostly reads (I/O bound) |
| Sort method | In-memory (top-N heapsort) | External merge (disk sort) |
| Join type | Nested Loop (small set), Hash Join (medium) | Nested Loop on large unindexed set |
Common plan nodes and what they mean:
Seq Scan → Full table scan (bad on large tables, fine on small ones)
Index Scan → Using an index to find rows (good)
Index Only Scan → All data from the index itself (best)
Bitmap Index Scan → Using index to build a bitmap, then scan heap (medium)
Hash Join → Building hash table for join (good for large-large joins)
Nested Loop → For each row in outer, scan inner (good for small-large)
Sort → Sorting result set (watch for external/disk sort)
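These node types are easiest to internalize by watching a plan change. A minimal, runnable sketch using SQLite as a stand-in (its EXPLAIN QUERY PLAN reports SCAN and SEARCH ... USING INDEX, the rough analogues of Seq Scan and Index Scan; the table and index names here are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (status, total) VALUES (?, ?)",
    [("pending" if i % 3 == 0 else "shipped", i * 1.5) for i in range(1000)],
)

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes a plan node
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM orders WHERE status = 'pending'"
before = plan(query)  # expect a SCAN node: full table scan

conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
after = plan(query)   # expect SEARCH ... USING INDEX idx_orders_status
```

The same before/after comparison is step 5 of the workflow above: capture the plan, apply the fix, capture it again.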
Index Design
The Right Index for the Query
-- Query: Find recent pending orders for a customer
SELECT * FROM orders
WHERE customer_id = 123
AND status = 'pending'
AND created_at > '2024-07-01'
ORDER BY created_at DESC;
-- Best index: compound index matching WHERE + ORDER BY
CREATE INDEX idx_orders_customer_status_created
ON orders (customer_id, status, created_at DESC);
-- Column order matters: equality columns first, the range/sort column last
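Column-order behavior is easy to verify empirically. A SQLite sketch of the same schema (plan node names differ from PostgreSQL, but the rule is identical: a query that skips the leading index column cannot seek the compound index and falls back to a full scan):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, created_at TEXT)""")
conn.executemany(
    "INSERT INTO orders (customer_id, status, created_at) VALUES (?, ?, ?)",
    [(i % 100, "pending", f"2024-07-{(i % 28) + 1:02d}") for i in range(1000)],
)
conn.execute("""CREATE INDEX idx_orders_customer_status_created
                ON orders (customer_id, status, created_at DESC)""")

def plan(sql):
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Leading column present: the compound index is seekable (SEARCH node)
seek = plan("SELECT * FROM orders WHERE customer_id = 5 AND status = 'pending' "
            "AND created_at > '2024-07-01' ORDER BY created_at DESC")

# Leading column missing: full scan despite the index (SCAN node)
scan = plan("SELECT * FROM orders WHERE created_at > '2024-07-01'")
```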
Index Selection Rules
| Scenario | Index Type |
|---|---|
| Equality filter (WHERE x = ?) | B-tree (default) |
| Range filter (WHERE x > ?) | B-tree, range column last in compound index |
| Text search (WHERE text LIKE '%query%') | Full-text index (GIN/GiST), NOT B-tree |
| JSON querying (WHERE data->>'key' = ?) | GIN index on JSON column |
| Geospatial (WHERE ST_Within(...)) | GiST (PostGIS) |
| Low-cardinality filter (WHERE status IN (...)) | Partial index or composite with high-cardinality column |
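For the low-cardinality row, a partial index is worth seeing concretely: only rows matching the index predicate are stored, so the index stays small while still serving the hot filter. A SQLite sketch with an invented schema (PostgreSQL uses the same CREATE INDEX ... WHERE syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)")
# ~10% of rows are 'pending'; the rest are 'done'
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 50, "pending" if i % 10 == 0 else "done") for i in range(1000)],
)

# Partial index: only the pending rows are indexed
conn.execute("""CREATE INDEX idx_orders_pending
                ON orders (customer_id) WHERE status = 'pending'""")

detail = [row[-1] for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM orders WHERE status = 'pending' AND customer_id = 10")]
```

The planner uses the partial index only when the query's WHERE clause implies the index predicate, so the filter must include the indexed status value.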
Over-Indexing vs Under-Indexing
| Too Few Indexes | Too Many Indexes |
|---|---|
| Slow reads, full table scans | Slow writes (every INSERT/UPDATE maintains all indexes) |
| Obvious: queries are slow | Hidden: write latency slowly increases |
| Fix: EXPLAIN → add targeted indexes | Fix: audit unused indexes, drop them |
-- Find unused indexes in PostgreSQL
SELECT
schemaname || '.' || relname AS table_name,
indexrelname AS index,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(i.indexrelid)) AS size
FROM pg_stat_user_indexes i
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0
AND NOT indisunique -- Keep unique constraint indexes
ORDER BY pg_relation_size(i.indexrelid) DESC;
Query Anti-Patterns
The N+1 Problem
# ❌ N+1 queries (1 query for orders + N queries for customers)
orders = db.query("SELECT * FROM orders WHERE status = 'pending'")
for order in orders:
    # This executes once PER ORDER — if 1000 orders, 1000 queries
    customer = db.query("SELECT * FROM customers WHERE id = ?", order.customer_id)
# ✅ Fixed: JOIN or batch fetch
orders = db.query("""
SELECT o.*, c.name as customer_name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
""")
# ✅ Alternative: batch fetch with IN clause
customer_ids = [o.customer_id for o in orders]
customers = db.query("SELECT * FROM customers WHERE id = ANY(?)", customer_ids)
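The difference is easy to measure by counting statements. A runnable sketch with SQLite and a small counting wrapper (the schema, row counts, and helper name q are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(i, f"cust-{i}") for i in range(100)])
conn.executemany("INSERT INTO orders (customer_id, status) VALUES (?, 'pending')",
                 [(i,) for i in range(100)])

queries = 0
def q(sql, params=()):
    global queries
    queries += 1  # count every round trip to the database
    return conn.execute(sql, params).fetchall()

# N+1: one query for orders, then one per order
orders = q("SELECT id, customer_id FROM orders WHERE status = 'pending'")
for _, customer_id in orders:
    q("SELECT name FROM customers WHERE id = ?", (customer_id,))
n_plus_one = queries  # 1 + 100 round trips

# Batch fetch: two queries total, regardless of order count
queries = 0
orders = q("SELECT id, customer_id FROM orders WHERE status = 'pending'")
ids = sorted({cid for _, cid in orders})
placeholders = ",".join("?" * len(ids))
q(f"SELECT id, name FROM customers WHERE id IN ({placeholders})", ids)
batched = queries
```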
Other Anti-Patterns
| Anti-Pattern | Why It Is Slow | Fix |
|---|---|---|
| SELECT * | Fetches all columns including large TEXT/BLOB | Select only needed columns |
| Functions on indexed columns | WHERE UPPER(email) = ? cannot use index on email | Functional index or normalize data |
| OR conditions | WHERE a = 1 OR b = 2 — cannot use single index efficiently | Use UNION or separate queries |
| Correlated subqueries | Subquery runs once per row in outer query | Rewrite as JOIN |
| Missing LIMIT | Returns 100K rows when you need 20 | Always LIMIT unless you need everything |
| OFFSET pagination | OFFSET 100000 scans and discards 100K rows | Use keyset pagination (WHERE id > last_seen_id) |
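The "functions on indexed columns" row is the most common of these in practice. A SQLite sketch with invented names (SQLite calls these expression indexes; PostgreSQL's functional indexes use the same CREATE INDEX ... (UPPER(email)) form):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"User{i}@example.com",) for i in range(500)])
conn.execute("CREATE INDEX idx_users_email ON users (email)")

def plan(sql):
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT id FROM users WHERE UPPER(email) = 'USER7@EXAMPLE.COM'"
before = plan(query)  # the plain index on email cannot serve UPPER(email): scan

# Expression index matching the wrapped column makes the query seekable
conn.execute("CREATE INDEX idx_users_email_upper ON users (UPPER(email))")
after = plan(query)
```

The index expression must match the query expression exactly; the alternative fix is to normalize the data at write time and query the stored column directly.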
Keyset Pagination vs OFFSET
-- ❌ OFFSET: gets slower as you paginate deeper
SELECT * FROM orders ORDER BY created_at DESC OFFSET 10000 LIMIT 20;
-- Scans 10,020 rows to return 20: the database walks and discards every skipped row, so cost grows linearly with page depth.
-- ✅ Keyset: constant speed regardless of page depth
SELECT * FROM orders
WHERE created_at < '2024-07-15T10:30:00Z' -- Last item from previous page
ORDER BY created_at DESC
LIMIT 20;
-- Always scans ~20 rows via index. Same speed at page 1 and page 500.
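A runnable sketch of both page fetches with SQLite (the timestamps are synthetic; in a real schema, add a unique tiebreaker such as (created_at, id) to the keyset so equal timestamps cannot drop rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
# 720 rows with unique, lexicographically sortable timestamps
conn.executemany(
    "INSERT INTO orders (created_at) VALUES (?)",
    [(f"2024-07-{d:02d}T{h:02d}:00:00Z",) for d in range(1, 31) for h in range(24)],
)
conn.execute("CREATE INDEX idx_orders_created ON orders (created_at)")

PAGE = 20

def first_page():
    return conn.execute(
        "SELECT id, created_at FROM orders ORDER BY created_at DESC LIMIT ?",
        (PAGE,)).fetchall()

def next_page(last_created_at):
    # Keyset: seek past the last row seen instead of counting rows with OFFSET
    return conn.execute(
        "SELECT id, created_at FROM orders WHERE created_at < ? "
        "ORDER BY created_at DESC LIMIT ?", (last_created_at, PAGE)).fetchall()

page1 = first_page()
page2 = next_page(page1[-1][1])  # cursor = last timestamp from page 1
```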
Connection Pooling
Every database connection consumes memory (5-10MB on PostgreSQL). Opening connections is slow (TCP handshake + authentication). Without pooling, each web request opens a new connection, uses it for 5ms, and closes it.
Without pooling (100 concurrent requests):
100 connections opened → 100 × 10MB = 1GB RAM on DB server
Each connection: 50ms to establish, 5ms to query, close
Database max_connections hit at scale → connections refused
With pooling (100 concurrent requests, pool size 20):
20 persistent connections shared across requests
Each request: borrow connection (0ms), query (5ms), return
Database uses 200MB instead of 1GB
Scales to thousands of concurrent requests
| Pool Setting | Recommended | Why |
|---|---|---|
| Pool size | 2-5 per CPU core | Too high wastes DB memory, too low causes queueing |
| Max overflow | 2× pool size | Handles burst traffic |
| Connection timeout | 5 seconds | Fail fast rather than queue indefinitely |
| Idle timeout | 300 seconds | Reclaim unused connections |
| Max lifetime | 1800 seconds | Prevent stale connections |
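A minimal fixed-size pool can be sketched in a few lines with a blocking queue; real pools (SQLAlchemy's, HikariCP, PgBouncer) add the overflow, idle, and max-lifetime handling from the table above. This toy version, with invented names, shows only the borrow/return cycle and the fail-fast acquire timeout:

```python
import queue
import sqlite3

class ConnectionPool:
    """Toy fixed-size pool: pre-open N connections, borrow and return them."""

    def __init__(self, size, connect):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(connect())

    def acquire(self, timeout=5.0):
        # Fail fast rather than queue indefinitely; raises queue.Empty on timeout
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(
    size=4,
    connect=lambda: sqlite3.connect(":memory:", check_same_thread=False),
)

conn = pool.acquire()          # borrow: no connection setup cost on the request path
result = conn.execute("SELECT 1").fetchone()[0]
pool.release(conn)             # return the connection for the next request
```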
Implementation Checklist
- Identify your top 10 slowest queries (from slow query log or APM)
- Run EXPLAIN ANALYZE on each slow query — understand the plan before fixing
- Add indexes for the top bottlenecks (compound indexes matching WHERE + ORDER BY)
- Fix N+1 queries: use JOINs or batch fetches instead of per-row queries
- Replace OFFSET pagination with keyset pagination for paginated endpoints
- Select only needed columns: never SELECT * in production code
- Configure connection pooling: pool size = 2-5 × CPU cores
- Set up slow query logging: alert on queries > 1 second
- Audit unused indexes quarterly and drop them (they slow down writes)
- Update table statistics regularly (ANALYZE in PostgreSQL) to keep query plans accurate
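The slow-query-logging item can be prototyped at the application layer before touching database config. A sketch (the threshold and helper names are illustrative; production setups usually rely on the database's own log, e.g. log_min_duration_statement in PostgreSQL):

```python
import logging
import sqlite3
import time

logging.basicConfig(level=logging.WARNING)
SLOW_QUERY_SECONDS = 1.0  # alert threshold from the checklist

slow_queries = []

def timed_query(conn, sql, params=()):
    # Time each statement; record and log anything over the threshold
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed > SLOW_QUERY_SECONDS:
        slow_queries.append((sql, elapsed))
        logging.warning("slow query (%.2fs): %s", elapsed, sql)
    return rows

conn = sqlite3.connect(":memory:")
rows = timed_query(conn, "SELECT 1")  # fast query: nothing is logged
```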