Query Performance: Reading the Plan

This chapter teaches you to read EXPLAIN output and fix queries before the database gets angry.

The Planner, Briefly

When you submit a SQL query, PostgreSQL parses it, then a planner picks a strategy. The planner considers scans (Seq, Index, Bitmap), join algorithms (Nested Loop, Hash, Merge), sort methods, and which indexes to use. It estimates the cost of each option and picks the cheapest.

Cost estimates rely on statistics about the tables: row counts, column distributions, most-common values. Statistics come from ANALYZE, which autovacuum runs automatically. When statistics are stale, the planner guesses badly.
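To see what the planner knows about a column, you can query the pg_stats view (orders and user_id follow this chapter's running example):

SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'user_id';

n_distinct and the most-common-values lists are exactly what the planner uses to estimate how many rows a predicate will match.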

Everything below is "how do I tell what the planner chose, and was it a good choice?"

EXPLAIN: What Would Happen

EXPLAIN SELECT * FROM orders WHERE user_id = 1;

Sample output:

 Index Scan using orders_user_id_idx on orders  (cost=0.15..8.17 rows=3 width=52)
   Index Cond: (user_id = 1)

Reading it:

  • Index Scan: the chosen access method.
  • cost=0.15..8.17: estimated start-up cost and total cost. Arbitrary units; lower is cheaper.
  • rows=3: estimated number of rows returned.
  • width=52: estimated average row size in bytes.

EXPLAIN is essentially free: it plans the query but doesn't execute it.

EXPLAIN ANALYZE: What Actually Happened

EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 1;
 Index Scan using orders_user_id_idx on orders
   (cost=0.15..8.17 rows=3 width=52)
   (actual time=0.024..0.028 rows=2 loops=1)
   Index Cond: (user_id = 1)
 Planning Time: 0.157 ms
 Execution Time: 0.065 ms

Compare estimated rows with actual rows. If they differ by 10x or more, statistics are stale (run ANALYZE) or the predicate is too complex for the planner to estimate well.

EXPLAIN ANALYZE actually executes the query. For INSERT, UPDATE, or DELETE, the writes take effect unless you wrap the statement in a transaction and roll it back:

BEGIN;
EXPLAIN ANALYZE UPDATE orders SET status = 'paid' WHERE id = 1;
ROLLBACK;

EXPLAIN ANALYZE BUFFERS

Add BUFFERS for memory and I/O info:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE user_id = 1;
 Index Scan using orders_user_id_idx on orders
   Buffers: shared hit=4

shared hit means blocks found in PostgreSQL's shared buffer cache. shared read means blocks fetched from outside it (the OS page cache or disk). If a "fast" query reports hundreds of reads, your cache isn't warm; the query may be slow under load.

Common Scan Types

Seq Scan: read every row in the table. Fast for small tables; deadly for large ones if you expected an index.

Index Scan: use the index to find matching rows, then fetch each matched row from the heap.

Index Only Scan: answer entirely from the index, which requires that all needed columns are in the index and that the visibility map marks the pages all-visible; otherwise heap fetches sneak back in. Fast. You may need an INCLUDE clause on the index (Chapter 7).

Bitmap Index Scan + Bitmap Heap Scan: PostgreSQL builds a bitmap of matching pages from the index, then reads the pages. Often chosen when many rows match; better cache behavior than many individual Index Scans.
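The same column can get different scans depending on selectivity. A quick experiment on this chapter's orders table (plans depend on your data and statistics, so treat the comments as typical outcomes, not guarantees):

EXPLAIN SELECT * FROM orders WHERE user_id = 1;       -- few matches: often an Index Scan
EXPLAIN SELECT * FROM orders WHERE user_id < 50000;   -- many matches: often a Bitmap or Seq Scan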

Join Strategies

Nested Loop: for each row from the outer relation, look up matches in the inner relation (usually via an index). Great for small outer relations.

Hash Join: build a hash table from the smaller side, stream the larger side through it. Great for large joins without useful indexes.

Merge Join: sort both sides on the join key, then stream them together. Great when inputs are already sorted.

The planner picks based on row estimates. Bad estimates lead to bad choices. The symptom: Nested Loop with millions of rows in the outer relation (torturous) or Hash Join that runs out of work_mem (spills to disk).
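A spilling Hash Join shows up in EXPLAIN ANALYZE as a Batches count above 1 on the Hash node. Raising work_mem for one session is a cheap experiment (the 64MB figure is arbitrary; the join assumes this chapter's orders and users tables):

SET work_mem = '64MB';
EXPLAIN ANALYZE
SELECT * FROM orders o JOIN users u ON u.id = o.user_id;
RESET work_mem;

If Batches drops to 1 and the query speeds up, memory was the bottleneck; consider whether the setting is safe to raise more broadly, since every concurrent sort and hash can claim that much.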

When Estimates Are Wrong

A few causes, in order of frequency:

  1. Stale statistics. Run ANALYZE table_name;.
  2. Correlated columns. The planner assumes columns are independent. If country = 'US' AND state = 'CA' is much more selective than the planner thinks, consider CREATE STATISTICS (multi-column stats).
  3. Complex expressions. WHERE length(name) > 10 has no statistics by default. An expression index on length(name) makes ANALYZE collect stats for that expression, and so does CREATE STATISTICS on an expression (PostgreSQL 14+).
  4. Expressions over joined columns. WHERE a.x + b.y = 5 is opaque to the planner.
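For the correlated-columns case, a sketch of multi-column statistics (addresses, country, and state are hypothetical names; the dependencies kind tracks functional dependencies between the columns):

CREATE STATISTICS country_state_stats (dependencies)
    ON country, state FROM addresses;
ANALYZE addresses;   -- extended statistics are collected by ANALYZE, like any others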

Common Slow-Query Patterns

Missing Index on Foreign Key

The most common performance bug in most codebases. Chapter 7 covered this. To find them:

SELECT c.conrelid::regclass AS "table",
       c.conname            AS "constraint",
       pg_get_constraintdef(c.oid)
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (
      SELECT 1
      FROM pg_index i
      WHERE i.indrelid = c.conrelid
        AND i.indkey[0] = c.conkey[1]  -- index's leading column matches the FK's first column
  );

That lists foreign keys whose first column has no index with a matching leading column on the referencing side.
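The fix for each offending constraint is an index on the referencing column; assuming an orders.user_id foreign key:

CREATE INDEX ON orders (user_id);

Each of these indexes speeds up joins and cascading deletes from the referenced table, at the usual write cost.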

ORDER BY + LIMIT Without a Matching Index

SELECT * FROM orders ORDER BY created_at DESC LIMIT 20;

Without an index on created_at, this sorts the whole table every time. With CREATE INDEX ON orders (created_at DESC), it reads the first 20 rows of the index and stops.

OR Queries

SELECT * FROM users WHERE email = $1 OR phone = $2;

The planner can sometimes combine two index scans with a BitmapOr, but often it gives up and falls back to a sequential scan. When it does, rewrite as UNION:

SELECT * FROM users WHERE email = $1
UNION
SELECT * FROM users WHERE phone = $2;

Each branch uses its own index; UNION combines the results and removes duplicates, so a user matching both predicates appears once. Use UNION ALL only if the branches can't overlap.

Keyset Pagination

LIMIT 100 OFFSET 100000 scans and discards 100,000 rows on every page. For deep pagination, remember the last value you saw:

SELECT id, created_at, ...
FROM orders
WHERE created_at < $last_seen
ORDER BY created_at DESC
LIMIT 100;

Pass the last row's created_at back in on the next page. With an index on created_at, every page costs about the same, no matter how deep you scroll.
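If created_at isn't unique, rows sharing a timestamp can be skipped or repeated at page boundaries. A row-value comparison with id as the tiebreaker avoids that (a sketch; $last_created_at and $last_id come from the previous page's final row):

SELECT id, created_at
FROM orders
WHERE (created_at, id) < ($last_created_at, $last_id)
ORDER BY created_at DESC, id DESC
LIMIT 100;

An index on (created_at DESC, id DESC) lets each page read straight off the index.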

N+1 Queries

Application code that fetches a list of users, then loops fetching each user's orders. Hundreds of round trips. Fix: one query with a JOIN, or WHERE user_id IN (...) with all the IDs, then group in application code.
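The batched rewrite can look like this (a sketch; $1 is an array parameter, and how you pass arrays depends on your driver):

-- one round trip instead of one per user:
SELECT * FROM orders WHERE user_id = ANY($1);

Then group the returned rows by user_id in application code.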

Finding Slow Queries

PostgreSQL ships pg_stat_statements as an extension: add it to shared_preload_libraries, restart the server, then run CREATE EXTENSION. Once running, it tracks every query by normalized pattern, with totals for calls, total time, and mean time:

CREATE EXTENSION pg_stat_statements;   -- once

SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

total_exec_time (calls × mean time) is usually what you care about: a 50ms query called a million times a day costs more than a 5-second query called ten times.
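When comparing before and after a fix, resetting the counters keeps old numbers from drowning out new ones:

SELECT pg_stat_statements_reset();   -- clears all collected statistics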

Vacuum and Bloat

PostgreSQL's MVCC means UPDATE and DELETE don't remove rows immediately; they mark them dead and let autovacuum clean up. If autovacuum can't keep up (long transactions blocking it, write-heavy tables), tables bloat: the physical file gets much bigger than the live data, and queries get slower as scans cover dead tuples.

Symptoms: a table whose size grows faster than the row count, noticeable slowdown after months of writes.

Inspect:

SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
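If a table shows a large n_dead_tup, a manual vacuum is the immediate remedy, and adding ANALYZE refreshes the statistics at the same time:

VACUUM (VERBOSE, ANALYZE) orders;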

Tune autovacuum per-table for hot tables:

ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05);
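The vacuum trigger point is autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × reltuples: with the defaults (threshold 50, factor 0.2), a 10-million-row table accumulates about 2,000,050 dead tuples before autovacuum fires; at 0.05 that drops to roughly 500,050. To confirm a per-table setting took effect:

SELECT relname, reloptions
FROM pg_class
WHERE relname = 'orders';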

Common Pitfalls

Measuring once. Cold caches, cold plans, cold buffers. Run the query 3 to 5 times before trusting a number.

Optimizing a query that runs once a day. Spend budget on hot queries. pg_stat_statements tells you which.

Adding an index to fix a slow query, and not checking what it breaks. Every index slows writes. Confirm the trade-off.

Trusting EXPLAIN over EXPLAIN ANALYZE for correctness. Plan costs are estimates. Actual times are truth.

Next Steps

Continue to 11-nosql-and-alternatives.md to know when SQL isn't the right answer.