PostgreSQL Query Optimization
Optimize PostgreSQL queries from slow scans to blazing-fast indexed lookups. Covers EXPLAIN ANALYZE, index strategies, query rewriting, join optimization, partial indexes, covering indexes, and the patterns that make PostgreSQL queries fast at scale.
A single slow query can take down a production database. PostgreSQL processes millions of rows efficiently — if you help it. Query optimization is about understanding how PostgreSQL executes queries and giving the planner the information it needs to choose the fastest path.
EXPLAIN ANALYZE
-- Always start here: see what PostgreSQL actually does
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 50;
-- What to look for:
-- Seq Scan on orders → Missing index, scanning entire table
-- Sort → Consider index for ORDER BY
-- Hash Join → Large join, may need index on join key
-- actual rows=1000 → 1000 rows scanned for 50 results
-- Buffers: shared hit → Data was in cache (good)
-- Buffers: shared read → Data loaded from disk (slow)
Index Strategies
-- B-tree (default): Equality and range queries
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_orders_created_at ON orders(created_at);
-- Composite index: Multiple columns (order matters!)
CREATE INDEX idx_orders_status_created
ON orders(status, created_at DESC);
-- Covers: WHERE status = 'pending' ORDER BY created_at DESC
-- Partial index: Index subset of rows
CREATE INDEX idx_orders_pending
ON orders(created_at DESC)
WHERE status = 'pending';
-- Smaller index, faster for common queries
-- Covering index (INCLUDE): Avoids table lookup
CREATE INDEX idx_orders_pending_covering
ON orders(status, created_at DESC)
INCLUDE (total, customer_id);
-- Index-only scan: all data in index, no heap access
-- GIN index: JSONB, arrays, full-text
CREATE INDEX idx_orders_metadata ON orders USING gin(metadata);
-- Supports: metadata @> '{"priority": "high"}'
-- BRIN index: Large tables with naturally ordered data
CREATE INDEX idx_events_timestamp ON events USING brin(timestamp);
-- Tiny index for time-series data
Common Query Rewrites
-- BAD: Function on indexed column (index not used)
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2026;
-- GOOD: Range query (index used)
SELECT * FROM orders
WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01';
-- BAD: OR on different columns (index not used efficiently)
SELECT * FROM orders WHERE status = 'pending' OR priority = 'high';
-- GOOD: UNION of indexed queries
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE priority = 'high' AND status != 'pending';
-- BAD: SELECT * when you need 2 columns
SELECT * FROM orders WHERE status = 'pending';
-- GOOD: Select only needed columns (enables index-only scan)
SELECT id, total FROM orders WHERE status = 'pending';
-- BAD: COUNT(*) on large table
SELECT COUNT(*) FROM orders; -- Seq scan, slow
-- GOOD: Approximate count
SELECT reltuples::bigint FROM pg_class WHERE relname = 'orders';
Join Optimization
-- Ensure join keys are indexed
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
-- For many-to-many, index both sides
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);
-- Lateral join for "top-N per group"
-- Find latest 3 orders per customer (efficient)
SELECT c.id, c.name, latest_orders.*
FROM customers c
CROSS JOIN LATERAL (
SELECT o.id, o.total, o.created_at
FROM orders o
WHERE o.customer_id = c.id
ORDER BY o.created_at DESC
LIMIT 3
) latest_orders;
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| No EXPLAIN ANALYZE | Guessing at performance | Always EXPLAIN before optimizing |
| Index everything | Write overhead, storage waste | Index only queried columns |
| SELECT * | Prevents index-only scans | Select needed columns only |
| Functions on WHERE columns | Index bypassed | Rewrite as range/equality |
| Missing composite indexes | Multiple single-column scans | Composite index matching query pattern |
PostgreSQL is remarkably fast when you work with it. The optimizer is sophisticated — but it needs indexes, statistics, and well-structured queries to do its best work.