When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When right-sizing a PostgreSQL instance, start by determining whether CPU spikes correlate with actual performance degradation rather than just resource utilization percentages. Then investigate memory efficiency through cache hit ratios and disk spills — many instances appear under-provisioned but are actually misconfigured. Finally, check I/O patterns and query efficiency to distinguish between needing more resources versus needing better tuning.
1. Correlate CPU spikes with query performance degradation
Look at `postgresql-active-queries` and `postgresql-backends` during those 85% CPU spike windows — if query counts aren't spiking and there's no queuing, your current provisioning is fine and those bursts are normal. CPU utilization alone doesn't tell you if you're under-provisioned; what matters is whether queries are slowing down or backing up during spikes. In Cloud SQL, brief spikes to 85% with good query performance indicate healthy burst behavior, not a need to scale up.
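As a sketch of what "queuing" looks like at the database level, the standard `pg_stat_activity` view can be sampled during a spike window — a pile-up of `active` backends with lock or I/O wait events is the signal to watch for, not the CPU percentage itself:

```sql
-- Snapshot of client backends by state and wait type; run (or sample from
-- your monitoring job) during a CPU spike window. Many 'active' rows stuck
-- on Lock/IO wait events indicate real queuing; a steady count means the
-- spike is healthy burst behavior.
SELECT state,
       wait_event_type,
       count(*) AS backends
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state, wait_event_type
ORDER BY backends DESC;
```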
2. Check buffer cache hit ratio for memory efficiency
Examine `buffer-hit` ratio — it should be above 99% for typical OLTP workloads and above 95% for mixed workloads. A low hit ratio suggests either insufficient shared_buffers (see `insufficient-shared-buffers-cache`) or a working set larger than available memory. However, check your shared_buffers configuration first: Cloud SQL defaults are often conservative, and increasing shared_buffers to 25% of RAM can dramatically improve cache efficiency without scaling up the instance.
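The same ratio can be computed directly from `pg_stat_database` if you want to cross-check the metric; the 99%/95% thresholds are the rule-of-thumb from above, not hard limits:

```sql
-- Cluster-wide buffer cache hit ratio since the last stats reset.
-- Below ~99% for OLTP (or ~95% for mixed workloads) warrants a look at
-- shared_buffers sizing or working-set growth.
SELECT round(sum(blks_hit) * 100.0 /
             nullif(sum(blks_hit) + sum(blks_read), 0), 2) AS cache_hit_pct
FROM pg_stat_database;
```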
3. Look for queries spilling to disk due to insufficient work_mem
Check `postgresql-temp-files` and `postgresql-temp-bytes` — these counters are cumulative since the last stats reset, so growing values mean queries with sorts, aggregations, or hash joins are spilling to disk because work_mem is too low (it defaults to just 4MB). This causes massive slowdowns and makes your instance appear CPU/IO-bound when the real issue is memory configuration. As noted in `work-mem-disk-spill` and `insufficient-work-mem`, increasing work_mem to 16-64MB for OLTP or 128MB+ for analytics can eliminate this bottleneck without resizing the instance.
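Spill activity is also visible per database in `pg_stat_database`; a quick sketch for confirming where the temp-file churn is coming from:

```sql
-- Temp file activity per database since the last stats reset; any database
-- showing sustained growth here is spilling sorts/hashes to disk under the
-- current work_mem setting.
SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_spilled,
       stats_reset
FROM pg_stat_database
WHERE temp_files > 0
ORDER BY temp_bytes DESC;
```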
4. Examine I/O wait times during CPU spikes
Review `postgresql-blk-read-time` and `postgresql-blk-write-time` during your CPU spike windows — high block read times (>5-10ms average) indicate I/O bottleneck rather than CPU constraint. If I/O wait is low during CPU spikes, you're truly CPU-bound and might benefit from more vCPUs. If I/O wait is high, the problem is disk throughput or inefficient queries causing excessive reads, not insufficient CPU, and scaling up CPU cores won't help much.
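One way to turn those cumulative counters into an average latency is below; note that `blk_read_time` and `blk_write_time` only accumulate when `track_io_timing` is on (it is off by default in vanilla PostgreSQL):

```sql
-- Average block read latency per database; requires track_io_timing = on.
-- blk_read_time is cumulative milliseconds since the last stats reset, so
-- averages above ~5-10ms suggest an I/O bottleneck rather than a CPU one.
SELECT datname,
       blks_read,
       round((blk_read_time / nullif(blks_read, 0))::numeric, 2) AS avg_read_ms,
       round(blk_write_time::numeric, 0) AS total_write_ms
FROM pg_stat_database
WHERE blks_read > 0
ORDER BY avg_read_ms DESC NULLS LAST;
```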
5. Identify inefficient sequential scans on large tables
Look at `postgresql-sequential-scans` on tables over 1M rows (correlate with `postgresql-database-size` to identify large tables) — frequent sequential scans generate unnecessary CPU and I/O load that makes your instance appear under-provisioned when the real issue is missing indexes. High sequential scan counts during your CPU spikes suggest that adding indexes could eliminate the need to scale up. This is especially common after schema changes or with poorly optimized application queries.
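The table-size correlation can be done in one pass over `pg_stat_user_tables`; the 1M-row cutoff below mirrors the threshold above and is adjustable:

```sql
-- Large tables ranked by sequential scans; a high seq_scan count paired with
-- a low idx_scan count usually points at a missing index rather than a CPU
-- shortage. The 1M-row cutoff matches the guidance above.
SELECT relname,
       n_live_tup,
       seq_scan,
       idx_scan,
       pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE n_live_tup > 1000000
ORDER BY seq_scan DESC
LIMIT 20;
```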
6. Verify shared_buffers isn't oversized causing double buffering
If shared_buffers is set above 40% of total RAM, you're likely experiencing double buffering (see `double-buffering-shared-buffers-oversized`) where the same data pages exist in both PostgreSQL's cache and the OS page cache, wasting memory while other hot pages get evicted. This makes the instance appear memory-constrained when it's actually misconfigured. Check your Cloud SQL shared_buffers parameter — if it's over 40%, reduce it to 25% and monitor whether cache performance improves without scaling up.
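To check the current value from inside the database (total instance RAM isn't visible via SQL, so compare against the machine size shown in the Cloud SQL console):

```sql
-- shared_buffers is stored in units of blocks; this converts it to a
-- human-readable size, assuming the default 8kB block size. Compare the
-- result against the instance's total RAM for the 25-40% guidance above.
SELECT name,
       setting,
       unit,
       pg_size_pretty(setting::bigint * 8192) AS shared_buffers_size
FROM pg_settings
WHERE name = 'shared_buffers';
```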