SQL Performance Tuning: Making Queries Fast Without Rewriting Everything

Diagnose and fix slow SQL queries using systematic analysis. Covers EXPLAIN plans, index design, query anti-patterns, N+1 problems, connection pooling, and the performance investigation workflow that finds the root cause instead of guessing.

The most common response to a slow database is to throw hardware at it. More CPU, more memory, read replicas. This works until it does not, and then you are paying 10x for infrastructure to compensate for queries that should take 5ms but take 5 seconds.

As a rule of thumb, 90% of database performance problems come from 10% of queries. Finding those queries, understanding why they are slow, and fixing them without breaking the application is both an art and a science. This guide covers the science part.


The Performance Investigation Workflow

1. Identify slow queries
   └─ Query log analysis, APM traces, top-queries view

2. Understand the query plan
   └─ EXPLAIN ANALYZE

3. Identify the bottleneck
   └─ Full table scan? Missing index? Too many rows? Lock contention?

4. Fix the bottleneck
   └─ Add index, rewrite query, change access pattern

5. Verify the fix
   └─ Compare EXPLAIN output before and after

6. Monitor in production
   └─ Track query latency after deployment

Reading EXPLAIN Plans

The EXPLAIN command is your primary diagnostic tool. It tells you exactly how the database plans to execute your query.

-- PostgreSQL
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 20;
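
The output looks something like this (illustrative numbers: idx_orders_created_at is an assumed index, and costs, row counts, and timings will differ on your data):

Limit  (cost=0.84..34.20 rows=20 width=46) (actual time=0.041..0.390 rows=20 loops=1)
  Buffers: shared hit=118
  ->  Nested Loop  (cost=0.84..1290.55 rows=773 width=46) (actual time=0.040..0.385 rows=20 loops=1)
        ->  Index Scan Backward using idx_orders_created_at on orders o
              (cost=0.42..745.80 rows=773 width=24) (actual time=0.027..0.201 rows=20 loops=1)
              Index Cond: (created_at > (now() - '7 days'::interval))
              Filter: (status = 'pending')
        ->  Index Scan using customers_pkey on customers c
              (cost=0.42..0.70 rows=1 width=30) (actual time=0.008..0.008 rows=1 loops=20)
              Index Cond: (id = o.customer_id)
Planning Time: 0.214 ms
Execution Time: 0.423 ms

Here the Index Scan Backward walks created_at in descending order, so the ORDER BY needs no Sort node, and shared hit=118 means every page came from cache.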

Key Indicators in the Plan

Indicator                 | Good                                        | Bad
--------------------------|---------------------------------------------|-------------------------------------
Scan type                 | Index Scan, Index Only Scan                 | Seq Scan (on large tables)
Rows estimated vs actual  | Close match (within 10x)                    | Wildly different (stale statistics)
Buffers                   | Mostly hits (cached)                        | Mostly reads (I/O bound)
Sort method               | In-memory (top-N heapsort)                  | External merge (disk sort)
Join type                 | Nested Loop (small set), Hash Join (medium) | Nested Loop on large unindexed set

Common plan nodes and what they mean:

Seq Scan          → Full table scan (bad on large tables, fine on small ones)
Index Scan        → Using an index to find rows (good)
Index Only Scan   → All data from the index itself (best)
Bitmap Index Scan → Using index to build a bitmap, then scan heap (medium)
Hash Join         → Building hash table for join (good for large-large joins)
Nested Loop       → For each row in outer, scan inner (good for small-large)
Sort              → Sorting result set (watch for external/disk sort)

Index Design

The Right Index for the Query

-- Query: Find recent pending orders for a customer
SELECT * FROM orders
WHERE customer_id = 123
  AND status = 'pending'
  AND created_at > '2024-07-01'
ORDER BY created_at DESC;

-- Best index: compound index matching WHERE + ORDER BY
CREATE INDEX idx_orders_customer_status_created
ON orders (customer_id, status, created_at DESC);
-- Column order matters: equality columns first, the range/sort column last

Index Selection Rules

Scenario                                        | Index Type
------------------------------------------------|---------------------------------------------------------
Equality filter (WHERE x = ?)                   | B-tree (default)
Range filter (WHERE x > ?)                      | B-tree, range column last in compound index
Text search (WHERE text LIKE '%query%')         | Full-text index (GIN/GiST), not B-tree
JSON querying (WHERE data->>'key' = ?)          | GIN index on JSON column
Geospatial (WHERE ST_Within(...))               | GiST (PostGIS)
Low-cardinality filter (WHERE status IN (...))  | Partial index or composite with high-cardinality column
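
Two of those rows are worth seeing as DDL. A sketch, with illustrative table and index names:

-- Partial index for the low-cardinality case: only 'pending' rows are
-- indexed, so the index stays small and matches the hot queries.
CREATE INDEX idx_orders_pending_created
ON orders (created_at DESC)
WHERE status = 'pending';

-- GIN index on a JSONB column: supports containment operators like @>.
CREATE INDEX idx_events_payload
ON events USING GIN (payload);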

Over-Indexing vs Under-Indexing

Too Few Indexes                      | Too Many Indexes
-------------------------------------|----------------------------------------------------------
Slow reads, full table scans         | Slow writes (every INSERT/UPDATE maintains all indexes)
Obvious: queries are slow            | Hidden: write latency slowly increases
Fix: EXPLAIN → add targeted indexes  | Fix: audit unused indexes, drop them

-- Find unused indexes in PostgreSQL
SELECT
    schemaname || '.' || relname AS table,
    indexrelname AS index,
    idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(i.indexrelid)) AS size
FROM pg_stat_user_indexes i
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0
  AND NOT indisunique      -- Keep unique constraint indexes
ORDER BY pg_relation_size(i.indexrelid) DESC;
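
Once an index is confirmed unused (check replicas too: these statistics are per-server), it can be dropped without blocking writes. The index name here is hypothetical:

-- CONCURRENTLY avoids the exclusive lock that would block writes.
DROP INDEX CONCURRENTLY idx_orders_legacy_status;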

Query Anti-Patterns

The N+1 Problem

# ❌ N+1 queries (1 query for orders + N queries for customers)
orders = db.query("SELECT * FROM orders WHERE status = 'pending'")
for order in orders:
    # This executes once PER ORDER — if 1000 orders, 1000 queries
    customer = db.query("SELECT * FROM customers WHERE id = ?", order.customer_id)

# ✅ Fixed: JOIN or batch fetch
orders = db.query("""
    SELECT o.*, c.name as customer_name
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.status = 'pending'
""")

# ✅ Alternative: batch fetch with IN clause
customer_ids = [o.customer_id for o in orders]
customers = db.query("SELECT * FROM customers WHERE id = ANY(?)", customer_ids)
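
The batch fetch still has to stitch customers back onto their orders; a dictionary keyed by id keeps that at one pass and O(1) per lookup (a sketch, assuming the query layer returns row objects with attribute access):

# Build an id-to-customer lookup once, then attach in a single pass.
customers_by_id = {c.id: c for c in customers}
for order in orders:
    order.customer = customers_by_id.get(order.customer_id)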

Other Anti-Patterns

Anti-Pattern                  | Why It Is Slow                                             | Fix
------------------------------|------------------------------------------------------------|---------------------------------------------
SELECT *                      | Fetches all columns, including large TEXT/BLOB             | Select only needed columns
Functions on indexed columns  | WHERE UPPER(email) = ? cannot use an index on email        | Functional index or normalize data
OR conditions                 | WHERE a = 1 OR b = 2 cannot use a single index efficiently | Use UNION or separate queries
Correlated subqueries         | Subquery runs once per row of the outer query              | Rewrite as JOIN
Missing LIMIT                 | Returns 100K rows when you need 20                         | Always LIMIT unless you need everything
OFFSET pagination             | OFFSET 100000 scans and discards 100K rows                 | Keyset pagination (WHERE id > last_seen_id)
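
The functional-index fix from the second row looks like this in PostgreSQL (table and column names are illustrative):

-- ❌ UPPER(email) cannot use a plain index on email
-- SELECT * FROM customers WHERE UPPER(email) = 'ALICE@EXAMPLE.COM';

-- ✅ An expression index matches the function call in the WHERE clause
CREATE INDEX idx_customers_email_upper ON customers (UPPER(email));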

Keyset Pagination vs OFFSET

-- ❌ OFFSET: gets slower as you paginate deeper
SELECT * FROM orders ORDER BY created_at DESC OFFSET 10000 LIMIT 20;
-- Scans 10,020 rows and discards all but 20. The deeper the page, the more
-- rows get scanned and thrown away.

-- ✅ Keyset: constant speed regardless of page depth
SELECT * FROM orders
WHERE created_at < '2024-07-15T10:30:00Z'  -- Last item from previous page
ORDER BY created_at DESC
LIMIT 20;
-- Always scans ~20 rows via index. Same speed at page 1 and page 500.
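
One caveat: created_at values can collide, and a timestamp-only keyset will then skip or repeat rows across pages. A unique tie-breaker column fixes this (the id value shown is illustrative):

-- ✅ Tie-safe keyset: row-value comparison on (created_at, id)
SELECT * FROM orders
WHERE (created_at, id) < ('2024-07-15T10:30:00Z', 48291)
ORDER BY created_at DESC, id DESC
LIMIT 20;
-- Pairs with an index on (created_at DESC, id DESC).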

Connection Pooling

Every database connection consumes memory (5-10MB on PostgreSQL). Opening connections is slow (TCP handshake + authentication). Without pooling, each web request opens a new connection, uses it for 5ms, and closes it.

Without pooling (100 concurrent requests):
  100 connections opened → 100 × 10MB = 1GB RAM on DB server
  Each connection: 50ms to establish, 5ms to query, close
  Database max_connections hit at scale → connections refused

With pooling (100 concurrent requests, pool size 20):
  20 persistent connections shared across requests
  Each request: borrow connection (0ms), query (5ms), return
  Database uses 200MB instead of 1GB
  Scales to thousands of concurrent requests

Pool Setting        | Recommended        | Why
--------------------|--------------------|----------------------------------------------------
Pool size           | 2-5 per CPU core   | Too high wastes DB memory, too low causes queueing
Max overflow        | 2× pool size       | Handles burst traffic
Connection timeout  | 5 seconds          | Fail fast rather than queue indefinitely
Idle timeout        | 300 seconds        | Reclaim unused connections
Max lifetime        | 1800 seconds       | Prevent stale connections
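
As a sketch of how these settings map onto application code, here they are as SQLAlchemy engine parameters (the connection URL is a placeholder; other pool libraries expose similar knobs under different names):

# SQLAlchemy engine with explicit pool settings.
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://app:secret@db-host/appdb",  # placeholder URL
    pool_size=10,        # roughly 2-5 per CPU core on the app server
    max_overflow=20,     # burst capacity: 2x pool size
    pool_timeout=5,      # seconds to wait for a free connection before failing
    pool_recycle=1800,   # max connection lifetime in seconds
    pool_pre_ping=True,  # verify a connection is alive before handing it out
)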

Implementation Checklist

  • Identify your top 10 slowest queries (from slow query log or APM)
  • Run EXPLAIN ANALYZE on each slow query — understand the plan before fixing
  • Add indexes for the top bottlenecks (compound indexes matching WHERE + ORDER BY)
  • Fix N+1 queries: use JOINs or batch fetches instead of per-row queries
  • Replace OFFSET pagination with keyset pagination for paginated endpoints
  • Select only needed columns — never SELECT * in production code
  • Configure connection pooling: pool size = 2-5 × CPU cores
  • Set up slow query logging: alert on queries > 1 second
  • Audit unused indexes quarterly and drop them (they slow down writes)
  • Update table statistics regularly (ANALYZE in PostgreSQL) to keep query plans accurate
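
For the last two checklist items, the PostgreSQL side is a pair of one-liners (ALTER SYSTEM requires superuser-level privileges):

-- Log every statement slower than 1 second.
ALTER SYSTEM SET log_min_duration_statement = '1s';
SELECT pg_reload_conf();

-- Refresh planner statistics so row estimates stay accurate.
ANALYZE;
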
Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
