Query Performance: Reading the Plan
This chapter teaches you to read EXPLAIN output and fix queries before the database gets angry.
The Planner, Briefly
When you submit a SQL query, PostgreSQL parses it, then a planner picks a strategy. The planner considers scans (Seq, Index, Bitmap), join algorithms (Nested Loop, Hash, Merge), sort methods, and which indexes to use. It estimates the cost of each option and picks the cheapest.
Cost estimates rely on statistics about the tables: row counts, column distributions, most-common values. Statistics come from ANALYZE, which autovacuum runs automatically. When statistics are stale, the planner guesses badly.
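These statistics are visible in the pg_stats view. A quick sketch, assuming the orders table used in the examples below:

```sql
-- What the planner knows about orders.user_id: the estimated number of
-- distinct values, and the most common values with their frequencies.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'user_id';
```

If this returns no row, the table has never been analyzed — run ANALYZE orders; and look again.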
Everything below is "how do I tell what the planner chose, and was it a good choice?"
EXPLAIN: What Would Happen
EXPLAIN SELECT * FROM orders WHERE user_id = 1;
Sample output:
Index Scan using orders_user_id_idx on orders (cost=0.15..8.17 rows=3 width=52)
Index Cond: (user_id = 1)
Reading it:
- Index Scan: the chosen access method.
- cost=0.15..8.17: estimated start-up cost and total cost. Arbitrary units; lower is cheaper.
- rows=3: estimated number of rows returned.
- width=52: estimated average row size in bytes.
EXPLAIN is cheap: it plans the query but never executes it.
EXPLAIN ANALYZE: What Actually Happened
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 1;
Index Scan using orders_user_id_idx on orders
(cost=0.15..8.17 rows=3 width=52)
(actual time=0.024..0.028 rows=2 loops=1)
Index Cond: (user_id = 1)
Planning Time: 0.157 ms
Execution Time: 0.065 ms
Compare estimated rows with actual rows. If they differ by 10x or more, statistics are stale (run ANALYZE) or the predicate is too complex for the planner to estimate well.
EXPLAIN ANALYZE actually executes the query. For INSERT, UPDATE, and DELETE the writes take effect, so wrap it in a transaction you roll back:
BEGIN;
EXPLAIN ANALYZE UPDATE orders SET status = 'paid' WHERE id = 1;
ROLLBACK;
EXPLAIN ANALYZE BUFFERS
Add BUFFERS for memory and I/O info:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE user_id = 1;
Index Scan using orders_user_id_idx on orders
Buffers: shared hit=4
shared hit means blocks found in PostgreSQL's shared buffer cache. shared read means blocks fetched from outside it (the OS page cache or disk). If a "fast" query reports hundreds of reads, your cache isn't warm; the query may be slow under load.
Common Scan Types
Seq Scan: read every row in the table. Fast for small tables; deadly for large ones if you expected an index.
Index Scan: use the index to find matching rows, then fetch each matched row from the heap.
Index Only Scan: answer entirely from the index (all needed columns are in the index). Fast. You may need an INCLUDE clause on the index (Chapter 7).
Bitmap Index Scan + Bitmap Heap Scan: PostgreSQL builds a bitmap of matching pages from the index, then reads the pages. Often chosen when many rows match; better cache behavior than many individual Index Scans.
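An Index Only Scan usually needs a covering index. A sketch with INCLUDE (the index name and payload column are illustrative, not from the schema above):

```sql
-- INCLUDE stores status in the index leaves without making it part of
-- the key, so a query like
--   SELECT status FROM orders WHERE user_id = 1;
-- can be answered without touching the heap at all.
CREATE INDEX orders_user_id_inc_status
    ON orders (user_id) INCLUDE (status);
```

Note that an Index Only Scan still consults the visibility map, so a freshly loaded or heavily updated table may need a VACUUM before the heap fetches actually disappear from the plan.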
Join Strategies
Nested Loop: for each row from the outer relation, look up matches in the inner relation (usually via an index). Great for small outer relations.
Hash Join: build a hash table from the smaller side, stream the larger side through it. Great for large joins without useful indexes.
Merge Join: sort both sides on the join key, then stream them together. Great when inputs are already sorted.
The planner picks based on row estimates. Bad estimates lead to bad choices. The symptom: Nested Loop with millions of rows in the outer relation (torturous) or Hash Join that runs out of work_mem (spills to disk).
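Spills are visible in EXPLAIN ANALYZE output: Batches: 2 or more on a Hash node, or Sort Method: external merge on a Sort. A session-local experiment to confirm the diagnosis (the users table is assumed alongside the chapter's orders table):

```sql
-- Raise work_mem for this session only, then re-check the plan.
-- Don't raise it globally without doing the arithmetic: every
-- concurrent sort and hash node may use up to work_mem bytes each.
SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.id, count(*)
FROM users u JOIN orders o ON o.user_id = u.id
GROUP BY u.id;
RESET work_mem;
```

If the batches drop to 1 and the query gets faster, memory was the bottleneck; decide deliberately whether to fix it with work_mem, an index, or a different query shape.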
When Estimates Are Wrong
A few causes, in order of frequency:
- Stale statistics. Run ANALYZE table_name;.
- Correlated columns. The planner assumes columns are independent. If country = 'US' AND state = 'CA' is much more selective than the planner thinks, consider CREATE STATISTICS (multi-column statistics).
- Complex expressions. WHERE length(name) > 10 has no column statistics. An expression index on length(name) gives ANALYZE something to gather statistics on; so does CREATE STATISTICS on the expression (PostgreSQL 14+).
- Expressions over joined columns. WHERE a.x + b.y = 5 is opaque to the planner.
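A minimal sketch of multi-column statistics for the correlated country/state case above (the addresses table name is assumed):

```sql
-- Tell the planner that country and state are not independent,
-- so it stops multiplying their selectivities together.
CREATE STATISTICS addr_country_state (dependencies)
    ON country, state FROM addresses;
ANALYZE addresses;  -- extended statistics are populated by ANALYZE
```

After the ANALYZE, re-run EXPLAIN on the affected query and compare the row estimate with the actual count.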
Common Slow-Query Patterns
Missing Index on Foreign Key
The single most common performance bug in typical codebases. Chapter 7 covered this. To find them:
SELECT conrelid::regclass AS table,
       conname AS constraint,
       pg_get_constraintdef(oid)
FROM pg_constraint
WHERE contype = 'f'
  AND NOT EXISTS (
    SELECT 1 FROM pg_index
    WHERE indrelid = conrelid
      AND indkey[0] = conkey[1]  -- int2vector subscripts start at 0
  );
That lists foreign keys without an index whose leading column matches the constraint's leading column on the referencing side. The check is approximate (it compares only leading columns), but that's what matters for foreign-key lookups.
ORDER BY + LIMIT Without a Matching Index
SELECT * FROM orders ORDER BY created_at DESC LIMIT 20;
Without an index on created_at, this sorts the whole table every time. With CREATE INDEX ON orders (created_at DESC), it reads the first 20 rows of the index and stops.
OR Queries
SELECT * FROM users WHERE email = $1 OR phone = $2;
The planner often can't use indexes on both columns for an OR. Rewrite as UNION:
SELECT * FROM users WHERE email = $1
UNION
SELECT * FROM users WHERE phone = $2;
Both branches use their own index; the results are combined.
Keyset Pagination
LIMIT 100 OFFSET 100000 reads and throws away 100,000 rows on every page. For deep pagination, remember the last value you saw:
SELECT id, created_at, ...
FROM orders
WHERE created_at < $last_seen
ORDER BY created_at DESC
LIMIT 100;
Pass the last row's created_at back in on the next page. Constant time, no matter how deep you scroll.
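If created_at is not unique, two rows can share the boundary value and a page boundary can skip or repeat rows. A tie-safe sketch uses a row-value comparison on (created_at, id), backed by a composite index on the same columns in the same order:

```sql
-- Row comparison is lexicographic and matches the ORDER BY, so an
-- index on (created_at DESC, id DESC) can serve this directly.
SELECT id, created_at
FROM orders
WHERE (created_at, id) < ($last_seen, $last_id)
ORDER BY created_at DESC, id DESC
LIMIT 100;
```

Pass both the last row's created_at and its id back in on the next page.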
N+1 Queries
Application code that fetches a list of users, then loops fetching each user's orders. Hundreds of round trips. Fix: one query with a JOIN, or WHERE user_id IN (...) with all the IDs, then group in application code.
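The batched version in SQL, assuming the user IDs collected from the first query arrive as an array parameter:

```sql
-- One round trip: all orders for all listed users.
-- $1 is an array parameter, e.g. '{1,2,3}'.
SELECT user_id, id, status, created_at
FROM orders
WHERE user_id = ANY($1)
ORDER BY user_id;
```

Ordering by user_id keeps each user's orders contiguous, which makes the grouping step in application code a single pass.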
Finding Slow Queries
PostgreSQL ships pg_stat_statements as an extension. The module must be listed in shared_preload_libraries in postgresql.conf (which requires a server restart) before it collects anything; then enable it per-database. Once running, it tracks every query by pattern, with totals for calls, total time, and mean time:
CREATE EXTENSION pg_stat_statements; -- once
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
total_exec_time (calls × mean time) is usually what you care about: a 50ms query called a million times a day costs more than a 5-second query called ten times.
Vacuum and Bloat
PostgreSQL's MVCC means UPDATE and DELETE don't remove rows immediately; they mark them dead and let autovacuum clean up. If autovacuum can't keep up (long transactions blocking it, write-heavy tables), tables bloat: the physical file gets much bigger than the live data, and queries get slower as scans cover dead tuples.
Symptoms: a table whose size grows faster than the row count, noticeable slowdown after months of writes.
Inspect:
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
Tune autovacuum per-table for hot tables:
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05);
Common Pitfalls
Measuring once. Cold caches, cold plans, cold buffers. Run the query 3 to 5 times before trusting a number.
Optimizing a query that runs once a day. Spend budget on hot queries. pg_stat_statements tells you which.
Adding an index to fix a slow query, and not checking what it breaks. Every index slows writes. Confirm the trade-off.
Trusting EXPLAIN over EXPLAIN ANALYZE for correctness. Plan costs are estimates. Actual times are truth.
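Before adding another index, it's worth checking whether the existing ones earn their keep. The statistics views record how often each index has been scanned:

```sql
-- Indexes never used since statistics were last reset.
-- Check when stats were reset before dropping anything.
SELECT indexrelid::regclass AS index, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY indexrelid::regclass::text;
```

Indexes that back unique or foreign-key constraints will show up here too even though they're doing useful work, so read the list with the schema in hand.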
Next Steps
Continue to 11-nosql-and-alternatives.md to know when SQL isn't the right answer.