Storage Performance Bottleneck
Incident Response
Disk I/O latency or throughput limitations are impacting database performance.
Prompt: “My PostgreSQL queries are slow and I see high disk I/O wait times. How do I determine if I need faster storage, more IOPS, or if there's a configuration issue I can fix?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When diagnosing storage performance issues in PostgreSQL, start by verifying you're actually bottlenecked on disk rather than configuration. Check buffer cache hit ratio and temp file usage first — these often reveal tunable config problems that masquerade as storage issues. Only after ruling out memory and configuration problems should you measure actual I/O latency and determine if you need faster storage.
1. Check buffer cache hit ratio to rule out memory issues
The first thing I check is `postgresql-buffer-hit` — if you're below 99%, you're doing excessive disk reads that could be avoided by increasing `shared_buffers`, not by buying faster storage. A 95% hit ratio means 5% of reads hit disk, which will feel like an I/O bottleneck even on fast NVMe. This is the most common false-positive for "slow storage" complaints. If your hit ratio is low and you have available RAM, bump shared_buffers before investigating storage.
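A minimal sketch of this check against the standard `pg_stat_database` view (the counters are cumulative since the last stats reset, so the ratio reflects long-run behavior, not the current minute):

```sql
-- Cluster-wide buffer cache hit ratio.
-- Below ~0.99 on an OLTP workload suggests shared_buffers (or OS cache)
-- is undersized -- a tuning problem, not a storage problem.
SELECT sum(blks_hit)::numeric
       / NULLIF(sum(blks_hit) + sum(blks_read), 0) AS cache_hit_ratio
FROM pg_stat_database;
```

Run it per database (`GROUP BY datname`) if one database dominates; a single cold analytics database can drag down the cluster-wide ratio.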
2. Look for temporary file usage indicating memory spills
Check `temp-size` to see if queries are spilling to disk because `work_mem` is too low. If you're writing gigabytes of temp files per hour, that's entirely avoidable I/O that has nothing to do with storage speed. Sorts, hash joins, and aggregations that don't fit in work_mem create massive disk churn. Increasing work_mem (carefully — it's per-operation) often eliminates what looks like a storage bottleneck.
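The same view exposes temp file counters, so a quick sketch of this check is:

```sql
-- Temp file volume per database since the last stats reset.
-- Large temp_bytes means sorts/hashes are spilling past work_mem --
-- avoidable I/O that has nothing to do with storage speed.
SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_written
FROM pg_stat_database
ORDER BY temp_bytes DESC;
```

Setting `log_temp_files = 0` additionally logs every spilling query, which identifies the specific statements that need more `work_mem` or a better plan.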
3. Measure actual I/O latency to confirm storage is the problem
Now measure real I/O latency using `postgresql-io-read-time` and `postgresql-io-write-time` divided by the corresponding operation counts from `reads`. If average read latency consistently exceeds 10ms or write latency exceeds 20ms, you likely have a genuine storage bottleneck. Compare these numbers to your storage tier's SLA — cloud volumes often have published IOPS and latency limits. This tells you if you've hit a hard limit or if something else is wrong.
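Assuming PostgreSQL 16+ with `track_io_timing = on`, a sketch of this calculation using `pg_stat_io` (on older versions, `blk_read_time / blks_read` from `pg_stat_database` gives a comparable per-database read latency):

```sql
-- Average latency per read/write operation (ms), by backend type.
-- Requires track_io_timing = on, otherwise the *_time columns are zero.
SELECT backend_type, object, context,
       read_time  / NULLIF(reads, 0)  AS avg_read_ms,
       write_time / NULLIF(writes, 0) AS avg_write_ms
FROM pg_stat_io
WHERE reads > 0 OR writes > 0
ORDER BY read_time + write_time DESC;
```

Because these counters are cumulative, sample them twice a few minutes apart and compute latency from the deltas if you want the current latency rather than the lifetime average.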
4. Analyze checkpoint write times for IOPS saturation
Look at `checkpoint-write-time` — if checkpoints are taking multiple seconds to write dirty buffers, you're likely hitting your IOPS limit. Cross-reference with checkpoint frequency (typically every 5-15 minutes by default). If checkpoints cause user-visible latency spikes, you can tune `checkpoint_completion_target` to spread writes more smoothly, but if that doesn't help, you're genuinely saturating storage and need more IOPS.
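A sketch of this check for PostgreSQL 16 and earlier, where checkpoint counters live in `pg_stat_bgwriter` (on 17+ they moved to `pg_stat_checkpointer`, with renamed columns):

```sql
-- Average write and sync time per checkpoint (ms).
-- Multi-second averages at default checkpoint intervals suggest the
-- storage tier cannot absorb the dirty-buffer flush -- an IOPS ceiling.
SELECT checkpoints_timed,
       checkpoints_req,
       checkpoint_write_time / NULLIF(checkpoints_timed + checkpoints_req, 0)
         AS avg_write_ms,
       checkpoint_sync_time  / NULLIF(checkpoints_timed + checkpoints_req, 0)
         AS avg_sync_ms
FROM pg_stat_bgwriter;
```

A high `checkpoints_req` relative to `checkpoints_timed` is a separate signal worth noting: checkpoints are being forced by WAL volume, and raising `max_wal_size` may smooth the write load before you conclude storage is saturated.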
5. Identify hot tables and indexes with per-object I/O stats
Drill into `blk-read-time`, `blk-write-time`, and `postgresql-disk-read` at the table/index level using pg_statio_user_tables and pg_statio_user_indexes. Often 80% of your I/O comes from 20% of your objects. If a few tables dominate, check for missing indexes (sequential scans on large tables), table bloat (vacuum issues), or hot spots that could be partitioned or cached differently. This reveals whether you need faster storage everywhere or just optimization for specific objects.
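A sketch of the table-level drill-down; the same pattern works against `pg_statio_user_indexes` with the `idx_blks_*` columns:

```sql
-- Tables whose heap reads dominate physical I/O.
-- A high miss_ratio on a large, frequently-read table points at a
-- missing index, bloat, or a working set that no longer fits in cache.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       heap_blks_read::numeric
         / NULLIF(heap_blks_read + heap_blks_hit, 0) AS miss_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 20;
```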
6. Match storage tier to workload characteristics
Finally, analyze your I/O pattern: look at the ratio of `read-time` to `write-time` and the randomness of access. OLTP workloads need high random read/write IOPS (look for NVMe or provisioned IOPS SSD), while analytical or reporting workloads need high sequential throughput (throughput-optimized HDD may suffice). If you're on general-purpose SSD doing 50K random IOPS, you need a faster tier. If you're on HDD doing random I/O, you need a fundamentally different storage type.
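A rough sketch of the read/write mix from `pg_stat_database` (timing columns again require `track_io_timing = on`); interpreting randomness vs. sequentiality still needs OS-level tools like `iostat`, since PostgreSQL does not report it directly:

```sql
-- Aggregate read vs. write profile for the cluster.
-- Read-time dominated + high blks_read: favor IOPS-optimized SSD/NVMe.
-- Write-time dominated: provisioned-IOPS volumes or faster fsync path.
SELECT sum(blks_read)                                 AS blocks_read,
       sum(tup_inserted + tup_updated + tup_deleted)  AS rows_written,
       sum(blk_read_time)                             AS total_read_ms,
       sum(blk_write_time)                            AS total_write_ms
FROM pg_stat_database;
```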
Technologies
Related Insights
I/O bottleneck
IOPS limit reached (threshold_indicator)
need for faster storage tier (prerequisite)
storage latency exceeding threshold (threshold_indicator)
storage type insufficient for workload (prerequisite)
I/O pattern optimization needed (supporting_evidence)
All entries above are stub insights created from scenario discovery hints.
Relevant Metrics
reads
blk_read_time (primary_diagnostic)
blk_write_time (primary_diagnostic)
postgresql.io.read_time (primary_diagnostic)
postgresql.io.write_time (primary_diagnostic)
buffer_hit (secondary_diagnostic)
checkpoint_write_time (monitor)
disk_read (primary_diagnostic)
read_time (primary_diagnostic)
write_time (primary_diagnostic)
temp_bytes (secondary_diagnostic)
Monitoring Interfaces
PostgreSQL Native