Query Plan Regression Detection
Incident Response
Previously fast queries have become slow because the query planner is choosing suboptimal execution plans.
Prompt: “A query that used to run in 100ms is now taking 30 seconds, but the data volume hasn't changed much. How do I check if the query plan changed and why the optimizer might be making a bad choice now?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When a query suddenly slows down without data volume changes, the first thing to confirm is that the query plan actually changed by comparing EXPLAIN ANALYZE output. Most plan regressions are caused by stale statistics, so check mod_since_analyze and last_analyze timestamps next. After ruling out statistics issues, investigate whether indexes are being ignored, data distribution has shifted, or planner cost parameters need tuning.
1. Capture and compare the current query plan
Run EXPLAIN (ANALYZE, BUFFERS) on the slow query to see the actual execution plan, estimated vs. actual row counts, and I/O patterns. If you have the old plan (from pg_stat_statements or logs), compare them side by side. Look for a changed join order, a switch from index scan to sequential scan, or drastically wrong row estimates: these are smoking guns. The `postgresql.queries.shared_blks_read` metric will spike if the planner switched to sequential scans.
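A minimal sketch of this capture step (the table names and the query itself are placeholders; substitute your slow query):

```sql
-- Capture the current plan with actual timings and buffer I/O details
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.total
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '7 days';

-- If the pg_stat_statements extension is enabled, pull historical timing
-- for comparison (column is mean_time on PostgreSQL 12 and earlier)
SELECT query, calls, mean_exec_time, shared_blks_read
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```

In the EXPLAIN output, compare each node's `rows=` estimate against the `actual rows=` figure; a large mismatch on the node that drives the join order usually points at the regression.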
2. Check for stale statistics on the involved tables
Query pg_stat_all_tables for the tables in your slow query and check `n_mod_since_analyze` and `last_analyze`. If you see tens of thousands of modifications since the last ANALYZE, or last_analyze is days/weeks old, the planner is working with outdated distribution data. This is the most common cause of plan regressions — the planner thinks a table has 1000 rows when it now has 1M, or vice versa. Run ANALYZE manually on those tables and re-test the query.
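One way to run this check, assuming placeholder table names `orders` and `customers`:

```sql
-- Modification counts since the last ANALYZE for the suspect tables
SELECT relname,
       n_live_tup,
       n_mod_since_analyze,
       last_analyze,
       last_autoanalyze
FROM pg_stat_all_tables
WHERE relname IN ('orders', 'customers');  -- placeholder names

-- Refresh statistics manually, then re-test the slow query
ANALYZE orders;
ANALYZE customers;
```

If manual ANALYZE fixes the plan, consider lowering `autovacuum_analyze_scale_factor` for that table so autovacuum keeps up in the future.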
3. Verify indexes are being used
Compare `postgresql.sequential_scans` vs `postgresql.index_rel_scans` for the tables in the query. If you see high sequential scans on tables with millions of rows, the planner is choosing table scans over indexes. Check if the query's WHERE clause can actually use your indexes — functional expressions, type mismatches, or OR conditions can prevent index usage. Also check if the index exists but statistics are stale (back to step 2).
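The same comparison is available directly from PostgreSQL's statistics views; a sketch (the `customers`/`email` index at the end is a hypothetical example of an expression index):

```sql
-- Compare sequential vs. index scan counts per table
SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;

-- Common index-defeating predicates:
--   WHERE customer_id = '42'      -- text literal vs. integer column
--   WHERE lower(email) = 'a@b.c'  -- needs an expression index
CREATE INDEX IF NOT EXISTS idx_customers_email_lower
    ON customers (lower(email));  -- hypothetical expression index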
4. Look for data distribution changes
Even with fresh statistics, if your data distribution has fundamentally changed (e.g., most rows now fall into one value after a bulk load, or you deleted 90% of a table), the planner's cost model might now prefer different plans. Check the actual row counts returned in EXPLAIN ANALYZE against the estimates — if estimates are off by 10x or more, the planner can't make good choices. Look at `postgresql.rows` trends over time to see if data volume spiked or dropped suddenly.
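To inspect what the planner believes about a column's distribution, and to give it more detail when one value dominates, something like the following can help (table and column names are placeholders):

```sql
-- The planner's view of the column distribution
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders'   -- placeholder table
  AND attname = 'status';    -- placeholder column

-- If the distribution is heavily skewed, raise the per-column
-- statistics target (default 100) so ANALYZE samples more values
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
```

A higher statistics target makes ANALYZE slower but lets the planner see more of the most-common-values list, which directly improves selectivity estimates on skewed columns.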
5. Review planner cost parameters
If statistics are fresh and indexes exist but the planner still chooses sequential scans, check random_page_cost, effective_cache_size, and seq_page_cost settings. On SSDs, random_page_cost should often be lowered to 1.1-1.5 (default is 4.0). If effective_cache_size is too low, the planner underestimates how much data fits in memory and favors sequential scans. Use EXPLAIN to see estimated costs, then tune parameters and re-EXPLAIN to see if the plan improves.
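A safe way to experiment is to change the parameters for the current session only, re-run EXPLAIN, and persist the change only after validating it (the query and the `12GB` value are placeholders; a common heuristic sets effective_cache_size to roughly 50-75% of RAM):

```sql
-- Inspect current planner cost settings
SHOW random_page_cost;
SHOW seq_page_cost;
SHOW effective_cache_size;

-- Try SSD-appropriate values for this session only
SET random_page_cost = 1.1;
SET effective_cache_size = '12GB';  -- placeholder value

-- Re-run the slow query's plan to see if it changes
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.total
FROM orders o
WHERE o.created_at > now() - interval '7 days';

-- Persist only after validating (requires superuser):
-- ALTER SYSTEM SET random_page_cost = 1.1;
-- SELECT pg_reload_conf();
```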
6. Monitor query I/O patterns over time
Track `postgresql.queries.shared_blks_read` for this specific query over time — a sudden jump indicates the planner switched from an index (reading few blocks) to a sequential scan (reading the entire table). Correlate this with `hits` vs `reads` ratio in pg_stat_database to see if the regression coincides with buffer pool pressure or a stats reset. If `stats_reset` is recent, you may have lost historical statistics that helped the planner make better choices.
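The underlying data for both checks is queryable directly; a sketch using pg_stat_statements and pg_stat_database:

```sql
-- Per-query block I/O: high shared_blks_read with a low hit ratio
-- suggests the plan is now reading far more of the table
SELECT queryid, calls,
       shared_blks_hit,
       shared_blks_read,
       round(100.0 * shared_blks_hit
             / nullif(shared_blks_hit + shared_blks_read, 0), 1) AS hit_pct
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;

-- Database-wide cache hit/read counters and the stats reset time
SELECT datname, blks_hit, blks_read, stats_reset
FROM pg_stat_database
WHERE datname = current_database();
```

If `stats_reset` is recent, treat all cumulative counters above as starting from that moment rather than from server start.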
Technologies
Related Insights
statistics out of date
Stub insight created from scenario discovery hint: statistics out of date
index not being used
Stub insight created from scenario discovery hint: index not being used
cost parameter tuning needed
Stub insight created from scenario discovery hint: cost parameter tuning needed
query plan regression
diagnostic_step
Stub insight created from scenario discovery hint: query plan regression
Stale statistics cause the planner to choose inefficient query plans, often with performance degraded by orders of magnitude
prerequisite
warning
data distribution change affecting estimates
prerequisite
Stub insight created from scenario discovery hint: data distribution change affecting estimates
planner choosing suboptimal plan
outcome
Stub insight created from scenario discovery hint: planner choosing suboptimal plan
need for ANALYZE
supporting_evidence
Stub insight created from scenario discovery hint: need for ANALYZE
join order changed
supporting_evidence
Stub insight created from scenario discovery hint: join order changed
Relevant Metrics
postgresql.rows
hits
reads
pg_generic
queries_duration_max (primary_diagnostic)
postgresql.queries.count (secondary_diagnostic)
postgresql.sequential_scans (secondary_diagnostic)
postgresql.index_rel_scans (monitor)
stats_reset (monitor)
last_analyze (secondary_diagnostic)
postgresql.queries.shared_blks_read (secondary_diagnostic)
n_mod_since_analyze (secondary_diagnostic)
Monitoring Interfaces
PostgreSQL Native