JVM GC Pauses Masquerading as Database Slowness

critical

Resource Contention

Updated Jan 10, 2026

Frequent or long garbage collection (GC) pauses surface as query timeouts and elevated cassandra_read_timeouts / cassandra_write_timeouts, but the root cause is JVM memory pressure, not database overload. Application threads are halted during a stop-the-world pause, so pauses longer than about 500ms can translate into client-side timeouts.

Sources

Critical Cassandra Performance Metrics to Monitor - Sematext (sematext.com)

Apache Cassandra Monitoring with OpenTelemetry [including dashboards and alerts] | SigNoz (signoz.io)

Latency Troubleshooting and Monitoring in Amazon Keyspaces for Apache Cassandra | AWS re:Post (repost.aws)

Technologies:

Cassandra (symptoms of this issue are visible in Cassandra metrics and logs)

How to detect:

Monitor JVM GC metrics for major GC events longer than 500ms, or young-generation GCs firing about once per second, that coincide with spikes in cassandra_read_timeouts and cassandra_write_timeouts. A further sign is heap usage (jvm.memory.heap.used) holding at 80-90% without the sawtooth drops that indicate a successful collection.
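
The detection rules above can be sketched as a few small checks. This is a minimal illustration, not a monitoring integration: the GcEvent record, the function names, the 5-second correlation window, and the sample values are all hypothetical, and it assumes GC events and heap samples have already been collected from your metrics pipeline.

```python
from dataclasses import dataclass

@dataclass
class GcEvent:
    start_s: float   # event start time in seconds (hypothetical field)
    pause_ms: float  # pause duration in milliseconds
    major: bool      # True for old-generation / full collections

def long_major_pauses(events, threshold_ms=500.0):
    """Return major GC events whose pause exceeds the threshold."""
    return [e for e in events if e.major and e.pause_ms > threshold_ms]

def young_gc_rate(events, window_s):
    """Young-generation collections per second over the observation window."""
    young = sum(1 for e in events if not e.major)
    return young / window_s

def heap_stuck_high(heap_fractions, low=0.80):
    """True if every heap sample sits at or above `low` (no sawtooth drop)."""
    return bool(heap_fractions) and min(heap_fractions) >= low

def pauses_near_timeouts(pauses, timeout_spike_times, slack_s=5.0):
    """Pauses that start within slack_s seconds of a timeout spike."""
    return [
        p for p in pauses
        if any(abs(p.start_s - t) <= slack_s for t in timeout_spike_times)
    ]

# Illustrative data only: two young GCs one second apart, then a long major GC.
events = [
    GcEvent(100.0, 35.0, False),
    GcEvent(101.0, 40.0, False),
    GcEvent(130.0, 820.0, True),
]
suspects = pauses_near_timeouts(long_major_pauses(events), [132.0])
```

A non-empty suspects list combined with heap_stuck_high returning True for recent jvm.memory.heap.used samples (as a fraction of maximum heap) would point toward GC pressure and away from database overload.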

Recommended action:

Tune the JVM heap size (MAX_HEAP_SIZE and HEAP_NEWSIZE in cassandra-env.sh). Review GC logs to identify allocation patterns. If heap usage persists at 80-90%, reduce memtable sizes, reduce key/row cache sizes, or add nodes to lower per-node data volume. Consider switching garbage collectors or upgrading to a newer JVM version with better GC algorithms.
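
As a starting point for heap tuning, the sketch below computes candidate MAX_HEAP_SIZE and HEAP_NEWSIZE values. It assumes the default sizing heuristic described in the comments of cassandra-env.sh in many Cassandra releases (heap = max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8GB)), young generation = min(100MB per core, 1/4 heap)). Verify this against the script shipped with your version. The function name and the sample machine sizes are hypothetical.

```python
def default_heap_sizes_mb(system_memory_mb, cpu_cores):
    """Candidate (max_heap_mb, new_size_mb) using the assumed default heuristic."""
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    # Heap: at most 1 GB on small machines, otherwise 1/4 of RAM capped at 8 GB.
    max_heap = max(min(half, 1024), min(quarter, 8192))
    # Young generation: 100 MB per core, capped at 1/4 of the heap.
    new_size = min(100 * cpu_cores, max_heap // 4)
    return max_heap, new_size

def as_env_lines(max_heap_mb, new_size_mb):
    """Format the values as cassandra-env.sh style assignments."""
    return [f'MAX_HEAP_SIZE="{max_heap_mb}M"', f'HEAP_NEWSIZE="{new_size_mb}M"']

# Example: a hypothetical node with 32 GB of RAM and 8 cores.
for line in as_env_lines(*default_heap_sizes_mb(32768, 8)):
    print(line)
```

Treat the result as a baseline to compare against GC logs, not a recommendation. Explicit young-generation sizing matters mainly for generational collectors such as CMS and is usually left unset with G1.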