Database Indexing Deep Dive
Master database indexing. Covers B-tree internals, composite index strategy, covering indexes, partial indexes, index maintenance, and common indexing mistakes that destroy performance.
Bad indexing is the #1 cause of database performance problems. A missing index turns a 2ms query into a 20-second table scan. A wrong index wastes disk space and slows writes for zero benefit. An unused index is pure overhead. This guide covers how indexes actually work and how to choose the right indexing strategy.
How B-Tree Indexes Work
[ 50 ]
/ \
[20, 35] [70, 85]
/ | \ / | \
[10,15] [25,30] [40,45] [60,65] [75,80] [90,95]
↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
Data Data Data Data Data Data
To find row with id=65:
50 → go right → 70 → go left → 65 found!
3 page reads instead of scanning all rows
Composite Index Strategy
The leftmost prefix rule determines which queries benefit from a composite index:
-- Index: (customer_id, status, created_at)
-- ✅ Uses index (full match, left-to-right)
SELECT * FROM orders
WHERE customer_id = 123 AND status = 'shipped' AND created_at > '2025-01-01';
-- ✅ Uses index (leftmost prefix)
SELECT * FROM orders WHERE customer_id = 123 AND status = 'shipped';
-- ✅ Uses index (leftmost column)
SELECT * FROM orders WHERE customer_id = 123;
-- ❌ Cannot use index (skips customer_id)
SELECT * FROM orders WHERE status = 'shipped';
-- ❌ Cannot use index (skips customer_id and status)
SELECT * FROM orders WHERE created_at > '2025-01-01';
Rule: Put equality columns first, range columns last.
Index Types
| Index Type | When to Use | Example |
|---|---|---|
| Single column | Filter/join on one column | ON orders(customer_id) |
| Composite | Multi-column filters | ON orders(customer_id, status) |
| Covering | Avoid table lookup | ON orders(customer_id) INCLUDE (total, status) |
| Partial | Filter subset of rows | ON orders(id) WHERE status = 'pending' |
| Expression | Computed values | ON users(LOWER(email)) |
| Unique | Enforce uniqueness | UNIQUE ON users(email) |
Finding Missing/Unused Indexes
-- PostgreSQL: Find slow queries needing indexes
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 100 -- > 100ms average
ORDER BY total_exec_time DESC
LIMIT 20;
-- PostgreSQL: Find unused indexes (wasting disk + RAM)
SELECT schemaname, relname, indexrelname,
idx_scan, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_user_indexes
WHERE idx_scan = 0 -- Never used
AND indexrelid NOT IN ( -- Not a unique constraint
SELECT indexrelid FROM pg_index WHERE indisunique
)
ORDER BY pg_relation_size(indexrelid) DESC;
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Index every column | Write overhead, wasted disk | Index based on actual query patterns |
| No indexes on foreign keys | Slow JOINs and cascading deletes | Always index FK columns |
| Wrong column order in composite | Index unused for common queries | Equality columns first, range last |
| Never analyzing index usage | Unused indexes consuming resources | Monthly review of pg_stat_user_indexes |
| Index on low-cardinality column | B-tree inefficient for 3 distinct values | Consider partial index or no index |
| Missing covering index | Extra table lookup for every row | INCLUDE columns to make index-only scans |
Checklist
- All foreign keys indexed
- Composite indexes ordered: equality first, range last
- Covering indexes for frequent queries (INCLUDE clause)
- Partial indexes for filtered subsets
- Unused indexes identified and dropped
- Missing index detection from slow query log
- Index size monitored (not exceeding table size)
- EXPLAIN ANALYZE before and after index changes
:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For database consulting, visit garnetgrid.com. :::