Apache Pulsar

Managed Ledger Cache Eviction Thrashing Degrades Consumer Performance

warning
Resource Contention
Updated Dec 16, 2024

A high pulsar_ml_cache_evictions rate indicates that the managed ledger cache is undersized. Entries are evicted before consumers read them, so reads repeatedly miss the cache and fall through to BookKeeper, which degrades consumer read performance and increases disk I/O load on the bookies.

How to detect:

Monitor pulsar_ml_cache_evictions for high eviction rates relative to pulsar_ml_cache_hit_throughput. A low cache hit ratio (cache hits divided by total reads) combined with elevated pulsar_bookie_read_size and pulsar_bookie_read_cache_size indicates cache thrashing. Compare pulsar_ml_cache_pool_active_allocated against the configured managedLedgerCacheSizeMB; sustained allocation near the configured limit means the cache is full and evicting under pressure.
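
The detection rule above can be sketched as a small check. This is a minimal illustration, not a Pulsar API: the CacheSample type, its field names, and the thresholds (0.8 hit ratio, 0.9 utilization) are assumptions chosen for the example. The inputs are assumed to be values already scraped from the metrics named above over the same time window, with the allocation metric assumed to be reported in bytes.

```python
from dataclasses import dataclass


@dataclass
class CacheSample:
    """Values sampled over one window. Field names are hypothetical."""
    hits: float             # cache hits observed in the window
    total_reads: float      # all reads (hits + misses) in the window
    evictions: float        # increase of pulsar_ml_cache_evictions in the window
    allocated_bytes: float  # pulsar_ml_cache_pool_active_allocated (assumed bytes)
    cache_size_mb: float    # configured managedLedgerCacheSizeMB


def hit_ratio(sample: CacheSample) -> float:
    """Cache hits divided by total reads; an idle cache counts as healthy."""
    if sample.total_reads <= 0:
        return 1.0
    return sample.hits / sample.total_reads


def is_thrashing(sample: CacheSample,
                 min_hit_ratio: float = 0.8,
                 min_utilization: float = 0.9) -> bool:
    """Flag thrashing: evictions occur, hit ratio is low, cache is near full.

    Both thresholds are illustrative assumptions, not Pulsar defaults.
    """
    utilization = sample.allocated_bytes / (sample.cache_size_mb * 1024 * 1024)
    return (sample.evictions > 0
            and hit_ratio(sample) < min_hit_ratio
            and utilization >= min_utilization)
```

For example, a sample with 200 hits out of 1000 reads, ongoing evictions, and 950 MiB allocated against a 1024 MB cache would be flagged, while the same sample with 950 hits would not. Suitable thresholds depend on the workload and should be tuned against a known-good baseline.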

Recommended action:

Increase managedLedgerCacheSizeMB to reduce eviction pressure. Tune managedLedgerCacheEvictionWatermark and managedLedgerCacheEvictionIntervalMs to smooth eviction patterns. The cache is allocated from JVM direct memory rather than heap, so raise the broker's direct memory limit (-XX:MaxDirectMemorySize) if needed to accommodate a larger cache. Check the consumer receive queue size (receiverQueueSize) and reduce it if consumers are buffering more messages than they need.
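
Before applying new settings, it can help to sanity-check them against the broker's memory budget. The sketch below is a hypothetical helper, not part of Pulsar: the function name, the max_fraction guard (cache limited to half of direct memory), and the example values are all assumptions for illustration. It assumes the cache is carved out of the broker's JVM direct memory and simply renders the three settings named above as broker.conf lines.

```python
def render_cache_overrides(cache_size_mb: int,
                           direct_memory_mb: int,
                           watermark: float,
                           eviction_interval_ms: int,
                           max_fraction: float = 0.5) -> str:
    """Validate proposed cache settings and render them as broker.conf lines.

    max_fraction is an assumed safety margin, not a Pulsar rule: it leaves
    room in direct memory for the broker's other buffers.
    """
    if cache_size_mb > direct_memory_mb * max_fraction:
        raise ValueError(
            f"cache of {cache_size_mb} MB exceeds {max_fraction:.0%} "
            f"of {direct_memory_mb} MB direct memory"
        )
    if not 0 < watermark <= 1:
        raise ValueError("watermark must be in (0, 1]")
    if eviction_interval_ms <= 0:
        raise ValueError("eviction interval must be positive")

    return "\n".join([
        f"managedLedgerCacheSizeMB={cache_size_mb}",
        f"managedLedgerCacheEvictionWatermark={watermark}",
        f"managedLedgerCacheEvictionIntervalMs={eviction_interval_ms}",
    ])


if __name__ == "__main__":
    # Illustrative values only; choose sizes from observed working-set data.
    print(render_cache_overrides(2048, 8192, 0.85, 20))
```

Rejecting an oversized cache up front is a deliberate choice here: a cache that crowds out other direct memory users can replace eviction thrashing with out-of-memory failures.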