Storage Write Latency Spikes Indicate BookKeeper Bottleneck
critical
High storage write latency (>1s) signals that BookKeeper ledgers cannot persist messages fast enough, creating a bottleneck that cascades to producer publish latency and overall throughput degradation.
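Track the storage write latency buckets, which Pulsar's Prometheus endpoint exposes as pulsar_storage_write_latency_le_* plus pulsar_storage_write_latency_overflow for entries that take >1s to persist (pulsar_storage_write_rate itself reports throughput, not latency). Correlate with pulsar_bookie_write_size and pulsar_bookie_flush to identify saturation in the BookKeeper storage layer, and check pulsar_bookkeeper_server_add_entry_count to see whether write throughput has hit a ceiling.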
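As a rough illustration of the check described above, the sketch below computes the share of storage writes that took longer than 1s from one scrape of the latency buckets. It assumes the buckets are named pulsar_storage_write_latency_le_* with a pulsar_storage_write_latency_overflow bucket for >1s, and that each bucket counts only the writes in its own range (non-cumulative). The function names and the 1% threshold are hypothetical choices, not part of Pulsar.

```python
from typing import Mapping

# Assumed metric names; adjust to match what your deployment actually exports.
BUCKET_PREFIX = "pulsar_storage_write_latency_le_"
OVERFLOW_KEY = "pulsar_storage_write_latency_overflow"  # writes slower than 1s


def slow_write_fraction(samples: Mapping[str, float]) -> float:
    """Return the fraction of storage writes that took longer than 1s.

    `samples` maps metric names to values from a single scrape.
    Buckets are assumed to be non-cumulative.
    """
    total = sum(
        value
        for name, value in samples.items()
        if name.startswith(BUCKET_PREFIX) or name == OVERFLOW_KEY
    )
    if total == 0:
        return 0.0  # no writes observed, nothing to flag
    return samples.get(OVERFLOW_KEY, 0.0) / total


def is_bottlenecked(samples: Mapping[str, float], threshold: float = 0.01) -> bool:
    """Flag a bottleneck when the slow-write share exceeds `threshold` (hypothetical default)."""
    return slow_write_fraction(samples) > threshold
```

A scrape in which 2 of 100 writes land in the overflow bucket gives a slow-write fraction of 0.02, which exceeds the illustrative 1% threshold and would be flagged.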
Tune the BookKeeper journal settings (journalBufferedWritesThreshold, journalMaxGroupWaitMSec, journalWriteBufferSizeKB) to balance write batching against latency and durability. Spread ledger directories across multiple disks to distribute I/O load. Scale the BookKeeper cluster horizontally by adding bookies. Monitor bookie disk I/O utilization and consider a faster storage tier if the disks stay saturated.
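The fragment below sketches where these settings might sit in a bookie configuration file. The values and mount paths are illustrative assumptions, not recommendations, and the directory keys (journalDirectories, ledgerDirectories) are included here only to show the multi-disk layout. Check the defaults for your BookKeeper version before changing anything.

```properties
# Hypothetical mount points: journal on its own device, ledgers spread over two disks.
journalDirectories=/mnt/journal
ledgerDirectories=/mnt/ledger-a,/mnt/ledger-b

# Maximum time (ms) to wait while grouping journal writes before flushing.
journalMaxGroupWaitMSec=1

# Bytes buffered for group commit before a flush is forced.
journalBufferedWritesThreshold=524288

# Size of the journal write buffer, in KB.
journalWriteBufferSizeKB=64
```

Larger grouping windows and thresholds generally favor throughput at the cost of per-entry latency, so it is safer to change one setting at a time and watch the write latency buckets after each change.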