Database Indexing Deep Dive

Bad indexing is the #1 cause of database performance problems. A missing index turns a 2ms query into a 20-second table scan. A wrong index wastes disk space and slows writes for zero benefit. An unused index is pure overhead. This guide covers how indexes actually work and how to choose the right indexing strategy.

How B-Tree Indexes Work

                    [  50  ]
                   /        \
           [20, 35]          [70, 85]
          /    |    \       /    |    \
     [10,15] [25,30] [40,45] [60,65] [75,80] [90,95]
      ↓  ↓    ↓  ↓    ↓  ↓    ↓  ↓    ↓  ↓    ↓  ↓
     Data     Data     Data   Data    Data    Data

To find row with id=65:
  50 → go right → 70 → go left → 65 found!
  3 page reads instead of scanning all rows

Composite Index Strategy

The leftmost prefix rule determines which queries benefit from a composite index:

-- Index: (customer_id, status, created_at)

-- ✅ Uses index (full match, left-to-right)
SELECT * FROM orders 
WHERE customer_id = 123 AND status = 'shipped' AND created_at > '2025-01-01';

-- ✅ Uses index (leftmost prefix)
SELECT * FROM orders WHERE customer_id = 123 AND status = 'shipped';

-- ✅ Uses index (leftmost column)
SELECT * FROM orders WHERE customer_id = 123;

-- ❌ Cannot use index (skips customer_id)
SELECT * FROM orders WHERE status = 'shipped';

-- ❌ Cannot use index (skips customer_id and status)
SELECT * FROM orders WHERE created_at > '2025-01-01';

Rule: Put equality columns first, range columns last.

Index Types

Index Type	When to Use	Example
Single column	Filter/join on one column	`ON orders(customer_id)`
Composite	Multi-column filters	`ON orders(customer_id, status)`
Covering	Avoid table lookup	`ON orders(customer_id) INCLUDE (total, status)`
Partial	Filter subset of rows	`ON orders(id) WHERE status = 'pending'`
Expression	Computed values	`ON users(LOWER(email))`
Unique	Enforce uniqueness	`UNIQUE ON users(email)`

Finding Missing/Unused Indexes

-- PostgreSQL: Find slow queries needing indexes
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 100  -- > 100ms average
ORDER BY total_exec_time DESC
LIMIT 20;

-- PostgreSQL: Find unused indexes (wasting disk + RAM)
SELECT schemaname, relname, indexrelname, 
       idx_scan, pg_size_pretty(pg_relation_size(indexrelid))
FROM pg_stat_user_indexes
WHERE idx_scan = 0           -- Never used
  AND indexrelid NOT IN (     -- Not a unique constraint
    SELECT indexrelid FROM pg_index WHERE indisunique
  )
ORDER BY pg_relation_size(indexrelid) DESC;

Anti-Patterns

Anti-Pattern	Problem	Fix
Index every column	Write overhead, wasted disk	Index based on actual query patterns
No indexes on foreign keys	Slow JOINs and cascading deletes	Always index FK columns
Wrong column order in composite	Index unused for common queries	Equality columns first, range last
Never analyzing index usage	Unused indexes consuming resources	Monthly review of `pg_stat_user_indexes`
Index on low-cardinality column	B-tree inefficient for 3 distinct values	Consider partial index or no index
Missing covering index	Extra table lookup for every row	INCLUDE columns to make index-only scans

Checklist

All foreign keys indexed
Composite indexes ordered: equality first, range last
Covering indexes for frequent queries (INCLUDE clause)
Partial indexes for filtered subsets
Unused indexes identified and dropped
Missing index detection from slow query log
Index size monitored (not exceeding table size)
EXPLAIN ANALYZE before and after index changes

:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For database consulting, visit garnetgrid.com. :::