Cassandra

JVM Heap Pressure Manifesting as Write Timeouts

critical
Resource Contention
Updated Jan 10, 2026

When heap usage stays above 80-90% and garbage collection fails to reclaim it, or when individual GC pauses exceed 500 ms, the node spends long stretches in stop-the-world pauses during which it cannot process requests. Writes routed to that node are not acknowledged in time, so the application sees WriteTimeoutException even though disk and network look healthy.

How to detect:

Monitor Cassandra heap memory usage for readings that stay above 85% for more than 5 minutes without a corresponding drop after garbage collection. Correlate this with a rising cassandra_write_timeouts count, and check whether GC pause times (via the JMX metric jvm.gc.collections.elapsed) exceed 500 ms. Confirm with cassandra_client_request_error showing WriteTimeoutException or OverloadedException patterns.
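
The detection rule above can be sketched as a small check over scraped samples. This is a minimal illustration, not a monitoring integration: the Sample shape, its field names, and the function name are hypothetical, and the thresholds are simply the ones quoted in the text. It assumes samples arrive in time order and that the write timeout counter is cumulative.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

HEAP_THRESHOLD = 0.85        # fraction of max heap, from the text above
SUSTAINED_SECONDS = 5 * 60   # "more than 5 minutes"
GC_PAUSE_LIMIT_MS = 500.0    # pause length treated as harmful


@dataclass(frozen=True)
class Sample:
    """One scrape of a node's metrics (hypothetical shape)."""
    ts: float               # seconds since epoch
    heap_fraction: float    # heap used / heap max, 0.0 to 1.0
    max_gc_pause_ms: float  # longest GC pause since the previous scrape
    write_timeouts: int     # cumulative write timeout counter


def heap_pressure_suspected(samples: Sequence[Sample]) -> bool:
    """Return True only when all three signals from the text line up."""
    if len(samples) < 2:
        return False

    # Find the longest unbroken run of samples above the heap threshold.
    longest = 0.0
    run_start: Optional[float] = None
    for s in samples:
        if s.heap_fraction > HEAP_THRESHOLD:
            if run_start is None:
                run_start = s.ts
            longest = max(longest, s.ts - run_start)
        else:
            run_start = None  # heap dropped, so GC recovered

    sustained_heap = longest > SUSTAINED_SECONDS
    long_pause = any(s.max_gc_pause_ms > GC_PAUSE_LIMIT_MS for s in samples)
    timeouts_rising = samples[-1].write_timeouts > samples[0].write_timeouts
    return sustained_heap and long_pause and timeouts_rising
```

Requiring all three signals together keeps the check from firing on a brief heap spike or on write timeouts caused by something other than GC.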

Recommended action:

Start by reviewing the JVM heap size in jvm.options (jvm-server.options on Cassandra 4.0 and later) and confirm that -Xms and -Xmx are set appropriately: typically 8-16 GB, and never more than 50% of system RAM. Analyze the GC logs to determine whether the pressure is in the young or the old generation, then tune the garbage collector accordingly (G1GC is recommended). If memtables are oversized, reduce memtable_heap_space_in_mb (named memtable_heap_space in Cassandra 4.1 and later). Consider horizontal scaling if write throughput consistently exceeds what a single node can handle.
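
The sizing guidance above can be worked through as a short calculation. This is a rough sketch under stated assumptions: the function name is hypothetical, the 16 GB cap and the 50% limit come from the text, and setting -Xms equal to -Xmx is a common practice assumed here rather than something the text prescribes. The right value for a given node still depends on its workload and GC logs.

```python
from typing import List


def suggest_heap_flags(system_ram_gb: int, cap_gb: int = 16) -> List[str]:
    """Derive candidate JVM flags from the rule of thumb in the text.

    Heap is the smaller of cap_gb and half of system RAM. Hypothetical
    helper for illustration only.
    """
    heap_gb = min(cap_gb, system_ram_gb // 2)
    if heap_gb < 1:
        raise ValueError("system_ram_gb is too small to size a heap")
    return [
        f"-Xms{heap_gb}G",   # initial heap (assumed equal to max)
        f"-Xmx{heap_gb}G",   # maximum heap
        "-XX:+UseG1GC",      # G1 collector, as recommended above
    ]


print(suggest_heap_flags(64))  # ['-Xms16G', '-Xmx16G', '-XX:+UseG1GC']
print(suggest_heap_flags(16))  # ['-Xms8G', '-Xmx8G', '-XX:+UseG1GC']
```

On hosts with less than 16 GB of RAM the 50% limit yields a heap below the typical 8-16 GB range, which is itself a sign that the node may be undersized for a write-heavy workload.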