Apache ZooKeeper

Latency Spike Threatens Client Session Stability

warning
latency
Updated Feb 6, 2026

ZooKeeper keeps client sessions alive through heartbeats (pings), so high server latency can translate directly into client disconnects. When average latency exceeds 50ms or max latency spikes above 100ms, heartbeats may not be sent and acknowledged within the session timeout window, and the affected sessions can expire.

How to detect:

Alert when zookeeper.latency.avg stays above 50ms for 5 minutes or longer, or when zookeeper.latency.max exceeds 100ms. Cross-reference with disk I/O metrics, JVM GC pause times, and CPU I/O wait to identify the bottleneck.
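The alert rule above can be sketched as a small poller. This is a minimal illustration, not a production monitor: it assumes the metrics are read from ZooKeeper's mntr four-letter command (which must be allowed by the server's 4lw.commands.whitelist on recent versions) and that zookeeper.latency.avg and zookeeper.latency.max correspond to the zk_avg_latency and zk_max_latency keys. The function names, the LatencyAlert class, and the alert labels are hypothetical.

```python
import socket

AVG_LIMIT_MS = 50.0      # average-latency threshold from the rule above
MAX_LIMIT_MS = 100.0     # max-latency threshold from the rule above
SUSTAIN_SECONDS = 300    # "5+ minutes"


def fetch_mntr(host="localhost", port=2181, timeout=5.0):
    """Send the 'mntr' four-letter command and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Parse tab-separated 'key<TAB>value' lines into a dict of strings."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            metrics[key.strip()] = value.strip()
    return metrics


class LatencyAlert:
    """Tracks how long the average latency has been above its threshold."""

    def __init__(self, sustain_seconds=SUSTAIN_SECONDS):
        self.sustain_seconds = sustain_seconds
        self.avg_breach_since = None  # timestamp of the first breaching sample

    def check(self, metrics, now):
        """Return a list of alert labels for one sample taken at time 'now'."""
        alerts = []
        avg = float(metrics.get("zk_avg_latency", 0))
        peak = float(metrics.get("zk_max_latency", 0))
        if avg > AVG_LIMIT_MS:
            if self.avg_breach_since is None:
                self.avg_breach_since = now
            if now - self.avg_breach_since >= self.sustain_seconds:
                alerts.append("avg_latency_sustained")
        else:
            self.avg_breach_since = None  # breach ended, restart the clock
        if peak > MAX_LIMIT_MS:
            alerts.append("max_latency")
        return alerts


if __name__ == "__main__":
    import time

    alert = LatencyAlert()
    while True:
        sample = parse_mntr(fetch_mntr())
        for label in alert.check(sample, time.time()):
            print("ALERT:", label, sample)
        time.sleep(30)
```

The max latency reported by the server is generally cumulative since the last statistics reset, so a single old spike can keep this check firing; a real monitor would likely compare successive samples or reset the statistics periodically.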

Recommended action:

Test disk I/O performance with hdparm, verify that no memory swapping is occurring, check the logs for JVM GC pauses, and examine network interface errors with ifconfig or ethtool. If disk I/O is the bottleneck, place the transaction log (dataLogDir) on a dedicated SSD, separate from the data snapshots (dataDir).
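As one illustration of the swapping check, the sketch below compares the kernel's cumulative swap counters across a short interval. It assumes a Linux host where /proc/vmstat exposes the pswpin and pswpout counters; the function names and the 10-second interval are hypothetical choices, and any nonzero delta is treated as a sign that swapping is occurring.

```python
import time

SWAP_KEYS = ("pswpin", "pswpout")  # pages swapped in / out since boot


def parse_vmstat(text):
    """Parse 'name value' lines (the /proc/vmstat layout) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Return the change in each swap counter between two snapshots."""
    return {key: after.get(key, 0) - before.get(key, 0) for key in SWAP_KEYS}


def is_swapping(before, after):
    """True if any pages were swapped in or out between the snapshots."""
    return any(delta > 0 for delta in swap_activity(before, after).values())


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_vmstat(handle.read())


if __name__ == "__main__":
    first = read_vmstat()
    time.sleep(10)  # sampling interval, an arbitrary choice for illustration
    second = read_vmstat()
    print("swap deltas:", swap_activity(first, second))
    print("swapping detected" if is_swapping(first, second) else "no swapping")
```

Comparing two snapshots rather than reading the counters once matters because the values are totals since boot; only their growth during the interval indicates current swap activity.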