JVM Heap Pressure Manifesting as Write Timeouts

critical

Resource Contention

Updated Jan 10, 2026

When heap usage climbs above 80-90% without recovering after GC, or when GC pause times exceed 500 ms, Cassandra cannot process requests during the stop-the-world pauses. The application sees this as WriteTimeoutException even though disk and network are healthy.

Sources

Troubleshooting Common Issues in Apache Cassandra - Nextbrick (nextbrick.com)

Apache Cassandra Monitoring: Tools, Challenges & Best Practices (last9.io)

Apache Cassandra Monitoring with OpenTelemetry [including dashboards and alerts] | SigNoz (signoz.io)

Cassandra Monitoring: Metrics, Troubleshooting, and Observability with CubeAPM - CubeAPM (cubeapm.com)

Technologies:

Cassandra (the root cause of this issue originates in Cassandra)

How to detect:

Monitor Cassandra heap memory usage for a sustained trend above 85% lasting more than 5 minutes, with no corresponding drops after GC. Correlate this with an increasing cassandra_write_timeouts count, and check whether GC pause times (via the JMX metric jvm.gc.collections.elapsed) exceed 500 ms. Confirm with cassandra_client_request_error showing WriteTimeoutException or OverloadedException patterns.
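The detection rule above can be sketched as a small check over scraped metric samples. This is only an illustration under stated assumptions: the Sample fields, the function name, and the choice to require all three signals together are hypothetical and not part of any Cassandra or monitoring API. The thresholds (85%, 5 minutes, 500 ms) are the ones given above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """One scrape of node metrics. All field names are hypothetical."""
    ts: float              # scrape time, in seconds
    heap_used_pct: float   # heap used as a percentage of max heap
    gc_pause_ms: float     # longest GC pause seen since the previous scrape
    write_timeouts: int    # cumulative write timeout counter


def heap_pressure_suspected(
    samples: Sequence[Sample],
    heap_pct: float = 85.0,
    window_s: float = 300.0,
    pause_ms: float = 500.0,
) -> bool:
    """Return True when heap, GC, and timeout signals line up.

    Assumes samples are sorted by ts in ascending order.
    """
    if not samples:
        return False
    latest = samples[-1].ts
    window = [s for s in samples if s.ts >= latest - window_s]
    # Only call the trend "sustained" if the data spans the full window.
    covers_window = samples[0].ts <= latest - window_s
    sustained = covers_window and all(s.heap_used_pct > heap_pct for s in window)
    long_pause = any(s.gc_pause_ms > pause_ms for s in window)
    timeouts_rising = window[-1].write_timeouts > window[0].write_timeouts
    return sustained and long_pause and timeouts_rising
```

Requiring every sample in the window to stay above the threshold is one way to express "without corresponding drops": a single healthy post-GC reading resets the condition.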

Recommended action:

Immediately review the JVM heap size in jvm.options (jvm-server.options in Cassandra 4.0 and later) and confirm that -Xms and -Xmx are set appropriately (typically 8-16 GB, never exceeding 50% of system RAM). Analyze GC logs to determine whether the pressure is in the young or the old generation. Tune the garbage collector strategy (G1GC is recommended). If memtables are oversized, reduce memtable_heap_space_in_mb (named memtable_heap_space in Cassandra 4.1 and later). Consider horizontal scaling if write throughput consistently exceeds single-node capacity.
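As a rough illustration of the sizing guidance above, the following sketch derives heap flags from system RAM. The function names are hypothetical, and setting -Xms equal to -Xmx is an assumption based on common practice, not something the guidance above states. Treat the result as a starting point to validate against GC logs, not a definitive setting.

```python
from typing import List


def suggest_heap_gb(system_ram_gb: float, ceiling_gb: float = 16.0) -> float:
    """Cap the heap at half of system RAM and at the typical 16 GB ceiling."""
    if system_ram_gb <= 0:
        raise ValueError("system_ram_gb must be positive")
    return min(ceiling_gb, system_ram_gb / 2)


def jvm_heap_flags(system_ram_gb: float) -> List[str]:
    """Build -Xms/-Xmx flags, using the same value for both (assumption)."""
    heap = int(suggest_heap_gb(system_ram_gb))  # round down to whole GB
    return [f"-Xms{heap}G", f"-Xmx{heap}G"]


if __name__ == "__main__":
    print(jvm_heap_flags(64))  # ['-Xms16G', '-Xmx16G']
    print(jvm_heap_flags(24))  # ['-Xms12G', '-Xmx12G']
```

On a host with less than 16 GB of RAM the suggested heap falls below the typical 8-16 GB range, which may itself be a sign that the node is undersized for its write load.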