Your query runs in 4,200ms. With one CREATE INDEX and a rewrite, it runs in 0.8ms. EXPLAIN ANALYZE tells you exactly why. This isn't magic; it's forensic engineering. PostgreSQL will happily perform a sequential scan on a 10-million-row table if you ask it the wrong way, grinding your application to a halt while your CPU yawns. The planner isn't stupid—it's working with the statistics and indexes you give it. Your job is to stop writing queries that fight the database and start writing ones that cooperate.
Reading EXPLAIN ANALYZE: From Seq Scan to Victory Lap
The difference between guessing and knowing is two words: EXPLAIN ANALYZE. Run your suspect query with it in psql, and PostgreSQL executes the query and returns the planner's blueprint, annotated with cold, hard timings.
```sql
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 1234 AND status = 'processed';
```
The output is a tree. You read it from the innermost indent outward. Here’s what to hunt for:
- `Seq Scan on orders (cost=0.00..158456.23 rows=1 width=145) (actual time=2.142..4200.117 rows=1 loops=1)`
  This is your smoking gun. A Sequential Scan reads every single row in the table. The `cost` is an estimate; the `actual time` is reality. 4200ms to find one row is a disaster. It means there's no usable index, or the planner thinks the index is too expensive.
- `Index Scan using idx_orders_user_id on orders (cost=0.43..8.45 rows=1 width=145) (actual time=0.023..0.024 rows=1 loops=1)`
  This is what you want. The planner used an index (`idx_orders_user_id`) to find rows directly. The jump from 4200ms to 0.024ms is the entire point of this article.
- `Hash Join (cost=...)` vs `Nested Loop (cost=...)`
  For joins, a `Hash Join` builds a hash table in memory from the smaller table, which is great for larger, non-indexed joins. A `Nested Loop` iterates through the outer rows and probes the inner side for each, which is fantastic when driving via an index on a small subset. If you see a `Nested Loop` with a massive inner `Seq Scan`, you've found an N+1 query hiding in a join.
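The full diagnostic loop, sketched against the `orders` query above (the composite index name and column order are illustrative):

```sql
-- 1. Capture the plan with buffer statistics (BUFFERS shows I/O work, not just time)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 1234 AND status = 'processed';

-- 2. If the top node is "Seq Scan on orders", add an index matching the filter
CREATE INDEX idx_orders_user_status ON orders (user_id, status);

-- 3. Re-run step 1: the plan should now show
--    "Index Scan using idx_orders_user_status on orders"
```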
The key metric is `actual time`. Ignore the estimated cost when you have real execution data. If the first number after `actual time` is high (e.g., `actual time=100.0..`), startup is slow, often because the node must sort or build a hash before it can return its first row. If the second number is massive (e.g., `..4200.117`), the entire operation is slow, usually a full table scan.
When Postgres Ignores Your Perfectly Good Index
You ran CREATE INDEX. The \d command in psql shows it. Yet EXPLAIN still shows a Seq Scan. Why?
1. Stale Statistics: The planner uses statistics about your data distribution to decide. If you just inserted 1 million rows, the stats might still say your table has 100 rows, and it will choose a Seq Scan every time.
   - Fix: Run `ANALYZE orders;` after large data changes. Autovacuum does this automatically, but it can lag.
2. The Query Can't Use the Index: The most common culprit. Your index is on `(email)`, but your query filters on `WHERE lower(email) = 'foo@bar.com'`. The `lower()` function renders the index useless.
   - Fix: Create a functional index: `CREATE INDEX idx_email_lower ON users (lower(email));`
3. The Planner Thinks a Seq Scan Is Cheaper: On small tables, or when your query selects more than roughly 5-10% of the table, reading sequentially from disk (which is optimized for this) can be faster than hopping around via random I/O to an index.
   - Fix: This is often correct. If it's not, try lowering the `random_page_cost` setting (default 4.0; values near 1.1 better reflect SSDs or data that is mostly in RAM), or shape the query so an index-only scan can serve it.
4. The Index Is Bloated: After many `UPDATE`s and `DELETE`s, indexes fill with dead tuples and become inefficient.
   - Fix: `REINDEX INDEX CONCURRENTLY idx_orders_user_id;`. Use `CONCURRENTLY` to avoid locking the table in production.
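To tell stale statistics apart from a correct seq-scan decision, you can temporarily make sequential scans look prohibitively expensive in one session. This is a diagnostic sketch only, never a production setting:

```sql
-- Refresh statistics first, ruling out the stale-stats case
ANALYZE orders;

-- Diagnostic only: force the planner to prefer the index in this session
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 1234;
RESET enable_seqscan;

-- If the forced index scan was actually slower than the seq scan,
-- the planner was right all along.
```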
Real Error & Fix:
`ERROR: operator does not exist: integer = text`
Fix: Cast the value rather than the column (`WHERE user_id = '1234'::integer`), or better, fix your application logic to bind the correct types. Casting the column (`user_id::text = '1234'`) also silences the error, but it defeats any B-tree index on `user_id`. For fuzzy text matching, enable the `pg_trgm` extension and create a GIN index.
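The `pg_trgm` route mentioned above might look like this sketch (the `users` table and `email` column are assumptions):

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- A trigram GIN index supports LIKE/ILIKE even with a leading wildcard,
-- which a plain B-tree index cannot serve
CREATE INDEX idx_users_email_trgm ON users USING GIN (email gin_trgm_ops);

-- Without the trigram index, this pattern forces a Seq Scan:
SELECT * FROM users WHERE email ILIKE '%@bar.com';
```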
B-tree vs GIN vs GiST vs BRIN: Picking Your Weapon
Throwing a CREATE INDEX on every column is a recipe for slow writes and storage bloat. Choose wisely.
| Index Type | Best For | Example Use Case | Performance Gain (from Benchmarks) |
|---|---|---|---|
| B-tree | Default. Equality & range queries on scalar data. | WHERE id = 5, WHERE created_at > '2024-01-01'. | 0.8ms vs 4200ms on 10M rows with a selective WHERE. |
| GIN | Composite values: Arrays, JSONB, Full-Text Search. | WHERE tags @> ARRAY['postgres'], WHERE doc @> '{"status":"active"}'. | 2ms vs 8500ms on 1M JSONB docs using @> operator. |
| GiST | Geometric data, full-text search, nearest-neighbor. | WHERE location <-> point '(0,0)' < 10, text search ranking. | Varies. The tool for spatial and complex matches. |
| BRIN | Very large tables with naturally sorted data (time). | WHERE created_at BETWEEN '2024-06-01' AND '2024-06-30' on a 10TB log table. | Minimal storage, fast for time-range scans on massive data. |
Rule of thumb: Start with B-tree. Use GIN for JSONB or array columns you query into. Use BRIN for append-only time-series data (consider TimescaleDB, which automates this). Use GiST for specialized spatial or text relevance searches.
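For the append-only case, a BRIN index is a one-liner; this sketch assumes a `logs` table whose `created_at` values arrive in roughly insertion order:

```sql
-- pages_per_range trades index size against scan precision (128 is the default)
CREATE INDEX idx_logs_created_brin ON logs USING BRIN (created_at)
    WITH (pages_per_range = 128);

-- A BRIN scan skips every block range whose min/max falls outside the window
SELECT count(*) FROM logs
WHERE created_at BETWEEN '2024-06-01' AND '2024-06-30';
```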
Partial Indexes: Cutting Your Index Size by 90%
Why index data you never query? A partial index indexes only a subset of rows defined by a WHERE clause.
```sql
-- Instead of indexing all orders:
CREATE INDEX idx_orders_status ON orders(status);

-- Index only the active ones (5% of your table):
CREATE INDEX idx_orders_active ON orders(status) WHERE status = 'active';
```
This index is tiny, fast to update, and fits in RAM. Queries with WHERE status = 'active' will use it. Queries for status = 'archived' won't—and that's fine, because you rarely run those. This is the single most effective trick for optimizing high-write tables with low-cardinality status fields.
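A common refinement, sketched here: once the `WHERE` clause pins down `status = 'active'`, every entry in the index shares the same status value, so index the column you actually search within that subset (the `user_id` pairing is illustrative):

```sql
-- Serves: WHERE status = 'active' AND user_id = $1
-- The predicate filters the rows; the key column makes the lookup selective.
CREATE INDEX idx_orders_active_user ON orders (user_id)
    WHERE status = 'active';
```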
Finding and Slaying N+1 Queries with pg_stat_statements
The N+1 problem isn't unique to ORMs. It's any pattern where you execute many small queries instead of one bulk query. In production, you find it with pg_stat_statements, the built-in query fingerprint tracker.
First, enable it in postgresql.conf:
```ini
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
```
Restart PostgreSQL, run `CREATE EXTENSION pg_stat_statements;` in your database, then find the worst offenders in psql:
```sql
SELECT query, calls, total_exec_time, mean_exec_time,
       round(rows::numeric / calls, 1) AS avg_rows  -- cast avoids integer truncation
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```
Look for a query with a high calls count and a low avg_rows (often 1). That's your N+1. The fix is almost always to rewrite using a JOIN or WHERE ... IN (...) clause to fetch data in a single round trip.
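The rewrite itself, as a sketch (the table and column names are illustrative; the smell is the application loop issuing the first query N times):

```sql
-- N+1 pattern: one query per user, executed in an application loop
SELECT * FROM orders WHERE user_id = 1;
SELECT * FROM orders WHERE user_id = 2;
-- ... N more round trips ...

-- Single round trip with ANY (equivalent to IN for a list of values)
SELECT * FROM orders WHERE user_id = ANY(ARRAY[1, 2, 3]);

-- Or, when the ids come from another table, a JOIN:
SELECT u.id, o.*
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.signup_date > '2024-01-01';
```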
Real Error & Fix:
`FATAL: remaining connection slots are reserved for non-replication superuser connections`
Fix: This is connection pool exhaustion. Short-term: increase `max_connections` in `postgresql.conf` (dangerous, since each connection carries memory and scheduling overhead). Real fix: implement PgBouncer in `transaction` pooling mode. This pools connections at the transaction level, allowing 200 app connections to share maybe 20-30 actual PostgreSQL connections. The benchmark doesn't lie: 50,000 req/s vs 8,000 without pooling.
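A minimal `pgbouncer.ini` for that setup might look like this sketch; the host, database name, and pool sizes are assumptions to adapt:

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction
max_client_conn = 200    ; app-side connections PgBouncer will accept
default_pool_size = 25   ; actual PostgreSQL connections per database/user pair
```

Note that transaction pooling breaks session-level state such as `SET` and session advisory locks, so verify your driver and ORM behave under it.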
JSONB Optimization: Making Document Queries Scream
JSONB is fantastic, but querying it without an index is a Seq Scan through every document. The @> (contains) and ? (exists) operators are your workhorses, and they require a GIN index.
```sql
-- Create the index
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);

-- Now these queries are fast:
SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';
SELECT * FROM products WHERE metadata ? 'in_stock';
```
Without the GIN index, these queries on 1 million documents take 8500ms. With it, they take ~2ms. The GIN index essentially creates a searchable map of every key and value in your JSONB column.
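If you only ever query with `@>`, a narrower operator class shrinks the index considerably; a sketch:

```sql
-- jsonb_path_ops indexes only @> containment (not ?, ?|, ?&),
-- producing a smaller index that is usually faster to search
CREATE INDEX idx_products_metadata_path ON products
    USING GIN (metadata jsonb_path_ops);

SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';
```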
Autovacuum Tuning: Stopping the Slow Bleed
Autovacuum is not optional. It cleans up dead rows from UPDATEs and DELETEs and runs ANALYZE to update statistics. If it falls behind, you get table and index bloat (wasted space, slow scans) and stale stats (bad planner decisions).
Monitor bloat. If you see queries slowing down over time despite indexes, check if autovacuum is keeping up:
```sql
SELECT schemaname, relname, n_live_tup, n_dead_tup,
       -- cast to numeric before dividing; bigint division would truncate to 0
       round(n_dead_tup::numeric / GREATEST(n_live_tup + n_dead_tup, 1), 2) AS dead_ratio
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY dead_ratio DESC;
```
If dead_ratio is consistently high (> 0.2), tune autovacuum for that table:
```sql
-- Be more aggressive on a high-churn table
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05);  -- default 0.2
ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.02); -- default 0.1
```
This tells autovacuum to run after 5% of the table has dead rows, instead of the default 20%. Do this on your most frequently updated tables.
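To confirm the new settings are taking effect, watch when autovacuum last visited the table:

```sql
SELECT relname, last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze, autovacuum_count
FROM pg_stat_user_tables
WHERE relname = 'orders';
```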
Next Steps: From Intermediate to Expert
You've turned a 4200ms query into 0.8ms. What now?
- Instrument Everything: Use `pg_stat_statements` as your continuous performance monitor. Set up alerts on high `mean_exec_time` or sudden spikes in `calls`.
- Embrace the Extension Ecosystem: Need AI? Install `pgvector` (8M+ installs and counting) for vector similarity searches. Need distributed tables? Look at Citus. Need time-series? Use TimescaleDB. PostgreSQL's superpower is its extensibility.
- Plan for Scale: Connection pooling with PgBouncer isn't "advanced"; it's essential for any production web app. Implement it before you hit the connection limit error. For high-write throughput, understand that Aurora PostgreSQL can offer 3x the write capacity of standard PostgreSQL at the same instance size.
- Stay Current: PostgreSQL 17 improves sequential scan performance by 15–20% on large tables. Upgrade plans aren't just about new features; they're about free performance boosts.
The goal isn't to memorize every knob. It's to build a diagnostic reflex: when the dashboard turns red, you reach for EXPLAIN ANALYZE, check pg_stat_statements, and know whether the fix is an index, a query rewrite, or a configuration tweak. Your database should be an engine, not a bottleneck. Now go make it one.