Technology | Scenario | Category | Severity | Grade | w/ Schema | Tokens | Latency | Turns | Tool Calls | Schema Calls | Response
PostgreSQL
Autovacuum Falling Behind: My Postgres tables are showing millions of dead tuples and autovacuum doesn't seem to be keeping up. How do I know if I need to tune autovacuum settings or if there's something blocking it?
Incident Response | warning | Grade: A- | w/ Schema: A- | Tokens: 2,380 vs 1,591 | Latency: 41.2s vs 31.8s | Turns: 9 vs 5 | Tool Calls: 3 vs 1 | Schema Calls: 0 vs 0 | Response: 3,065 chars vs 3,506 chars
PostgreSQL
Background Writer and Checkpoint Bottleneck: I'm seeing high bgwriter_buffers_backend and low bgwriter_buffers_clean in PostgreSQL metrics. My write performance has degraded and I think backends are writing their own dirty buffers instead of the background writer handling it. How do I tune this?
Performance | warning
PostgreSQL
Backup and Recovery Performance: I need to verify our PostgreSQL backup strategy is working well. How do I assess backup performance, storage costs, and whether we can actually meet our recovery time objectives?
Proactive Health | info
PostgreSQL
Cache Hit Ratio Degradation: I noticed our PostgreSQL cache hit ratio dropped from 99% to 85% over the past week. Is this a problem and should I increase shared_buffers or is something else going on?
Proactive Health | warning | Grade: B+ | w/ Schema: A- | Tokens: 1,310 vs 1,296 | Latency: 24.9s vs 25.2s | Turns: 6 vs 5 | Tool Calls: 2 vs 1 | Schema Calls: 0 vs 0 | Response: 2,728 chars vs 2,996 chars
PostgreSQL
Checkpoint Tuning for Write Performance: My PostgreSQL instance is experiencing periodic write latency spikes. I see these correlate with checkpoints happening every few minutes. How do I tune checkpoint settings to smooth out performance?
Proactive Health | warning
PostgreSQL
Comprehensive Operational Readiness Review: I set up PostgreSQL a while ago and the workload has evolved a lot since then. How do I make sure it's well configured and provisioned? I want a comprehensive health check covering performance, capacity, reliability, and cost.
Proactive Health | info
PostgreSQL
Concurrent Index Build Monitoring: I'm running CREATE INDEX CONCURRENTLY on a 200GB production table in PostgreSQL. It's been running for 2 hours and I need to know if it's progressing normally or stuck. How do I monitor the operation and what could cause it to fail?
Proactive Health | info
PostgreSQL
Connection Pool Exhaustion: We're getting 'FATAL: sorry, too many clients already' errors in our app logs. Our Cloud SQL PostgreSQL instance is at 95% of max_connections. What do I do right now and how do I prevent this?
Incident Response | critical | Grade: B+ | w/ Schema: A- | Tokens: 601 vs 852 | Latency: 15.9s vs 19.1s | Turns: 2 vs 2 | Tool Calls: 0 vs 0 | Schema Calls: 0 vs 0 | Response: 1,490 chars vs 2,053 chars
PostgreSQL
Connection Timeout and Network Issues: Our app is getting intermittent 'could not connect to server' and connection timeout errors from PostgreSQL. How do I diagnose whether this is a network issue, database overload, or configuration problem?
Incident Response | critical
PostgreSQL
Cost Optimization for Over-Provisioned Instance: Our monthly PostgreSQL bill on RDS is $5000 but CPU utilization averages 15% and memory is at 40%. How do I safely downsize to save money without risking performance issues?
Cost Optimization | info
PostgreSQL
Cost optimization through performance tuning: My monthly AWS RDS PostgreSQL bill is $8,000 and I need to reduce costs by 30% without degrading performance. Help me identify whether I should optimize slow queries to downgrade my instance, tune autovacuum to reduce IOPS costs, or consolidate databases.
Cost Optimization | info | Grade: C+ | w/ Schema: B- | Tokens: 1,437 vs 1,648 | Latency: 30.0s vs 36.2s | Turns: 7 vs 7 | Tool Calls: 2 vs 2 | Schema Calls: 0 vs 0 | Response: 1,558 chars vs 1,310 chars
PostgreSQL
Cross-platform metric mapping for PostgreSQL: I'm migrating my PostgreSQL monitoring from Datadog to Prometheus. How do Datadog's postgresql.connections metric and AWS CloudWatch's DatabaseConnections metric map to the postgres_exporter metrics in Prometheus?
Cross Platform | info | Grade: A- | w/ Schema: B+ | Tokens: 868 vs 1,109 | Latency: 18.1s vs 44.1s | Turns: 2 vs 8 | Tool Calls: 0 vs 3 | Schema Calls: 0 vs 2 | Response: 2,078 chars vs 1,818 chars
PostgreSQL
Data Type Selection and Schema Design Problems: I'm reviewing our PostgreSQL schema and I see we're using timestamp without time zone in many places, char(20) for some text fields, and the money type for currency. Are these bad choices and what problems might they cause in production?
Proactive Health | warning
PostgreSQL
Deadlock Troubleshooting: We're seeing deadlock errors in our application logs several times per hour. How do I find what queries are causing deadlocks and how to prevent them?
Incident Response | warning
PostgreSQL
Disk space exhaustion from WAL files: My PostgreSQL pg_wal directory is consuming 80% of available disk space and growing. I have replication configured; how do I safely clean this up without breaking replication or causing data loss?
Incident Response | critical | Grade: B+ | w/ Schema: A- | Tokens: 1,273 vs 1,355 | Latency: 25.4s vs 24.6s | Turns: 2 vs 2 | Tool Calls: 0 vs 0 | Schema Calls: 0 vs 0 | Response: 3,427 chars vs 3,755 chars
PostgreSQL
Extension Management and Compatibility: We're using several PostgreSQL extensions like PostGIS and pg_stat_statements. How do I check which extensions are installed, their performance impact, and whether they'll work if we upgrade from PostgreSQL 13 to 15?
Migration | info
PostgreSQL
Fast Path Lock Exhaustion from Partitioning: Our PostgreSQL queries on a partitioned table suddenly got 10x slower. I see locks showing up in pg_locks but not blocking each other. Someone mentioned fast path lock exhaustion - how do I diagnose this and what's the fix?
Performance | warning
PostgreSQL
Index bloat requiring maintenance: My PostgreSQL indexes are showing significant bloat (>30%) on several high-traffic tables. Should I run REINDEX CONCURRENTLY or use pg_repack, and how do I avoid locking out my production workload?
Proactive Health | warning | Grade: B+ | w/ Schema: A- | Tokens: 1,348 vs 1,673 | Latency: 26.7s vs 32.6s | Turns: 2 vs 2 | Tool Calls: 0 vs 0 | Schema Calls: 0 vs 0 | Response: 3,842 chars vs 4,874 chars
PostgreSQL
Index Health and Usage Analysis: I want to audit my PostgreSQL indexes. Which ones are never used and safe to drop, and where might I be missing indexes that would improve query performance?
Proactive Health | info
PostgreSQL
Lock contention blocking transactions: My PostgreSQL database has queries backing up because they're waiting on locks. Help me identify which queries are holding locks, which are blocked, and whether I have a deadlock situation or just long-running DDL.
Incident Response | warning | Grade: B | w/ Schema: B+ | Tokens: 1,386 vs 2,677 | Latency: 21.2s vs 39.9s | Turns: 2 vs 5 | Tool Calls: 0 vs 1 | Schema Calls: 0 vs 0 | Response: 4,012 chars vs 1,745 chars
PostgreSQL
Logical Replication Conflict Resolution: My PostgreSQL logical replication subscription stopped working with unique constraint violation errors. The replication worker keeps failing and I'm seeing apply errors in pg_stat_subscription. How do I diagnose what's causing the conflict and safely resume replication without losing data?
Incident Response | critical
PostgreSQL
Memory Configuration Optimization: I inherited a PostgreSQL database and the memory settings seem to be defaults from years ago. How do I determine the right values for shared_buffers, work_mem, and maintenance_work_mem based on our current instance size and workload?
Proactive Health | info
PostgreSQL
Memory Exhaustion and OOM Killer Prevention: My PostgreSQL server keeps crashing and the Linux OOM killer is terminating the postgres process. I have work_mem set to 256MB but we have hundreds of concurrent connections. How do I calculate safe memory limits and prevent these crashes?
Incident Response | critical
PostgreSQL
Migration from RDS to Cloud SQL: We're planning to migrate our PostgreSQL database from AWS RDS to Google Cloud SQL. What are the key differences I need to understand in terms of performance metrics, configuration, and operational procedures?
Migration | info
PostgreSQL
MultiXact Member Space Exhaustion: My PostgreSQL database is throwing errors about MultiXact member space exhaustion and all writes are failing. I see messages about emergency autovacuum in the logs. What is MultiXact member space and how do I fix this before we have a complete outage?
Incident Response | critical
PostgreSQL
N+1 Query Pattern Detection and Resolution: Our PostgreSQL connection count spikes to 200+ during busy periods and API response times are terrible. The logs show thousands of nearly identical simple SELECT queries for related records. I think we have an N+1 query problem - how do I find and fix these patterns?
Performance | warning
PostgreSQL
Operational readiness review: I set up Postgres a while ago. The workload has evolved a lot since then. How do I make sure it's well configured and provisioned?
Proactive Health | info | Grade: B+ | w/ Schema: A- | Tokens: 1,123 vs 2,515 | Latency: 23.4s vs 49.3s | Turns: 7 vs 23 | Tool Calls: 3 vs 13 | Schema Calls: 0 vs 3 | Response: 2,386 chars vs 3,590 chars
PostgreSQL
PostgreSQL migration from AWS RDS to Google Cloud SQL: I'm planning to migrate our PostgreSQL database from AWS RDS to Google Cloud SQL. What are the key differences in monitoring capabilities, performance characteristics, and operational features I need to account for during and after the migration?
Migration | info | Grade: B+ | w/ Schema: A- | Tokens: 1,196 vs 1,278 | Latency: 26.9s vs 47.1s | Turns: 2 vs 2 | Tool Calls: 0 vs 0 | Schema Calls: 0 vs 0 | Response: 4,007 chars vs 4,545 chars
PostgreSQL
Query Plan Regression Detection: A query that used to run in 100ms is now taking 30 seconds, but the data volume hasn't changed much. How do I check if the query plan changed and why the optimizer might be making a bad choice now?
Incident Response | warning
PostgreSQL
Replication Lag Crisis: My PostgreSQL read replica on RDS is showing 45 seconds of replication lag and climbing. What's causing this and how do I fix it before it impacts users?
Incident Response | critical
PostgreSQL
Replication lag threatening data consistency: My PostgreSQL read replicas are falling behind the primary by 30 seconds and climbing. Help me diagnose if this is a resource bottleneck, network issue, or replication slot problem before it causes an outage.
Incident Response | critical | Grade: B+ | w/ Schema: A- | Tokens: 7,911 vs 6,796 | Latency: 12.2m vs 1.8m | Turns: 13 vs 12 | Tool Calls: 6 vs 6 | Schema Calls: 0 vs 0 | Response: 2,654 chars vs 2,942 chars
PostgreSQL
Replication Slot Bloat and Cleanup: My PostgreSQL disk is filling up with WAL files and I see an inactive replication slot that hasn't been used in days. It's preventing vacuum from cleaning up old data. Can I safely drop this slot or will it break replication?
Incident Response | warning
PostgreSQL
Right-Sizing for Workload Growth: Help me determine whether my Postgres deployment on Google Cloud SQL should be provisioned up or down based on my current workload. We started on this instance size 6 months ago and traffic has doubled.
Capacity Planning | warning
PostgreSQL
Right-sizing PostgreSQL cloud instances: Help me determine whether my Postgres deployment on Google Cloud SQL should be provisioned up or down based on my current workload. I'm seeing 45% CPU utilization on average but occasional spikes to 85%.
Capacity Planning | info | Grade: D+ | w/ Schema: B+ | Tokens: 1,162 vs 813 | Latency: 25.1s vs 18.8s | Turns: 6 vs 2 | Tool Calls: 2 vs 0 | Schema Calls: 0 vs 0 | Response: 128 chars vs 1,648 chars
PostgreSQL
Slow Query Diagnosis: Our app response times have doubled in the last hour. How do I quickly find which PostgreSQL queries are slow and whether they need better indexes or query optimization?
Incident Response | warning
PostgreSQL
Slow query performance investigation: Several of my PostgreSQL queries have gotten significantly slower over the past month. Help me identify the top offenders and determine whether this needs new indexes, query rewrites, or updated table statistics.
Proactive Health | warning | Grade: A- | w/ Schema: C+ | Tokens: 6,431 vs 1,553 | Latency: 1.8m vs 12.4m | Turns: 16 vs 18 | Tool Calls: 9 vs 9 | Schema Calls: 0 vs 2 | Response: 3,872 chars vs 1,679 chars
PostgreSQL
Slow Replica Performance Investigation: Our PostgreSQL read replica has zero replication lag but queries are running 3x slower than on the primary. What could cause this performance difference and how do I diagnose it?
Incident Response | warning
PostgreSQL
Standby Query Conflict Resolution: Our PostgreSQL read replica is canceling long-running queries with errors about conflicting recovery processes. I see bufferpin and snapshot conflicts in pg_stat_database_conflicts. How do I balance replication consistency with allowing analytical queries to complete?
Incident Response | warning
PostgreSQL
Storage Performance Bottleneck: My PostgreSQL queries are slow and I see high disk I/O wait times. How do I determine if I need faster storage, more IOPS, or if there's a configuration issue I can fix?
Incident Response | warning
PostgreSQL
Table Bloat Diagnosis and Remediation: One of my PostgreSQL tables is taking up 50GB of disk space but only has 10GB of actual data when I dump it. How do I measure bloat and safely reclaim this space?
Proactive Health | warning
PostgreSQL
Temp Table and Temporary File Management: I'm seeing hundreds of GB of temp_bytes in pg_stat_database and queries are slowing down. How do I determine if I should increase work_mem or if these queries need optimization?
Proactive Health | warning
PostgreSQL
Transaction ID Wraparound Emergency Response: I'm getting FATAL errors about transaction ID wraparound in PostgreSQL and the database is warning it will shut down in 1 million transactions. The age of the oldest transaction is over 2 billion. What do I need to do right now to prevent an outage?
Incident Response | critical
PostgreSQL
Transaction ID Wraparound Risk: I'm getting warnings about transaction ID wraparound in my PostgreSQL logs. The age of the oldest transaction is over 1.5 billion. How urgent is this and what do I need to do?
Incident Response | critical | Grade: B+ | w/ Schema: A- | Tokens: 1,098 vs 814 | Latency: 24.3s vs 18.7s | Turns: 2 vs 2 | Tool Calls: 0 vs 0 | Schema Calls: 0 vs 0 | Response: 2,971 chars vs 2,097 chars
PostgreSQL
VACUUM FULL Execution Planning: I have a 500GB PostgreSQL table with 60% bloat that needs VACUUM FULL to reclaim space. How do I plan this operation for production - how long will it take, how much extra disk space do I need, and what's the impact on running queries?
Proactive Health | warning
PostgreSQL
Version Upgrade Planning and Validation: We need to upgrade from PostgreSQL 12 to PostgreSQL 15. What should I check for compatibility issues, and how do I validate performance won't regress after the upgrade?
Migration | info
PostgreSQL
WAL Accumulation and Disk Space Crisis: My PostgreSQL instance on RDS is running out of disk space and I see the WAL directory has grown to 200GB. Why isn't WAL archiving or cleanup happening and how do I fix this urgently?
Incident Response | critical