Apache Flink

RocksDB Write Stall Cascade

warning
Resource Contention
Updated May 31, 2022

Slow RocksDB flushes cause write stalls that propagate upstream as backpressure, degrading throughput and increasing checkpoint durations.

How to detect:

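Monitor the RocksDB native metrics actual-delayed-write-rate, num-running-flushes, and mem-table-flush-pending. Flink does not report these by default; each must be enabled through its state.backend.rocksdb.metrics.* option. A non-zero actual-delayed-write-rate means RocksDB is throttling incoming writes. If it stays non-zero while mem-table-flush-pending shows flushes waiting and num-running-flushes remains low, flushes are not keeping up with writes, which usually indicates insufficient disk I/O. Correlate with rising flink_task_checkpointalignmenttime and falling flink_operator_recordsoutpersec.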
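The detection rule can be sketched as a small check over metric samples. This is a minimal illustration, not part of Flink or any monitoring product: the RocksDBSample type, the function names, and the thresholds (min_fraction, max_running_flushes) are hypothetical and should be tuned to the job. It assumes the three metrics have already been scraped into plain numbers.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RocksDBSample:
    """One scrape of the three RocksDB metrics (hypothetical container type)."""
    actual_delayed_write_rate: float  # 0 means writes are not being throttled
    num_running_flushes: int          # flushes currently in progress
    mem_table_flush_pending: int      # > 0 when a memtable is waiting to flush


def is_suspect(sample: RocksDBSample, max_running_flushes: int = 1) -> bool:
    """True when writes are throttled, a flush is waiting, and few flushes run."""
    throttled = sample.actual_delayed_write_rate > 0
    flush_backlog = sample.mem_table_flush_pending > 0
    few_flushes = sample.num_running_flushes <= max_running_flushes
    return throttled and flush_backlog and few_flushes


def looks_io_bound(
    samples: Sequence[RocksDBSample],
    min_fraction: float = 0.5,       # assumed threshold, not a Flink default
    max_running_flushes: int = 1,    # assumed threshold, not a Flink default
) -> bool:
    """Flag a window when at least min_fraction of its samples are suspect."""
    if not samples:
        return False
    hits = sum(is_suspect(s, max_running_flushes) for s in samples)
    return hits / len(samples) >= min_fraction
```

Evaluating a window of samples rather than a single scrape avoids alerting on short stalls that clear on their own; a flagged window is a prompt to check checkpoint alignment time and output rate, not proof of a disk bottleneck.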

Recommended action:

Increase RocksDB background thread concurrency via state.backend.rocksdb.thread.num so that flushes and compactions can saturate the available disk I/O. Verify that disk throughput is adequate for the workload; additional threads do not help once the disk itself is saturated. Consider enabling incremental checkpointing to reduce checkpoint pressure. Review RocksDB tuning for write-heavy workloads, balancing write, read, and space amplification.
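As a starting point, the settings mentioned above might look like the following flink-conf.yaml fragment. This is a sketch that assumes Flink 1.x configuration key names (later releases may rename some of them, so check the configuration reference for the version in use). The thread count of 4 is an illustrative value, not a recommendation; size it against the cores and disk bandwidth actually available.

```yaml
# Sketch only: assumes Flink 1.x key names; values are illustrative.
state.backend: rocksdb

# Upload only changed SST files at each checkpoint.
state.backend.incremental: true

# Maximum concurrent background flush and compaction jobs per stateful operator.
# 4 is an assumed example value.
state.backend.rocksdb.thread.num: 4

# RocksDB native metrics are off by default; enable the ones used for detection.
state.backend.rocksdb.metrics.actual-delayed-write-rate: true
state.backend.rocksdb.metrics.num-running-flushes: true
state.backend.rocksdb.metrics.mem-table-flush-pending: true
```

Enabling native metrics adds some overhead, so it may be reasonable to enable only the few needed for this diagnosis.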