Raft Commit Latency Spike Delays Metadata Propagation
Warning: In KRaft mode, high Raft commit latency delays metadata changes from propagating through the cluster, causing stale metadata and operational delays.
Watch for kafka.raft.commit_latency_avg or kafka.raft.commit_latency_max exceeding 1 second. Cross-reference with kafka.raft.append_records_rate to check whether the spike coincides with increased write volume.
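The detection rule above can be sketched as a small classifier. This is only an illustration: RaftSample, classify_sample, the baseline append rate, and the 2x volume factor are hypothetical names and values, not part of Kafka or of any monitoring product. It also assumes the latency samples have already been converted to seconds, so convert first if your collector reports milliseconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RaftSample:
    """One scrape of the three metrics named above (hypothetical container)."""
    commit_latency_avg_s: float  # kafka.raft.commit_latency_avg, assumed in seconds
    commit_latency_max_s: float  # kafka.raft.commit_latency_max, assumed in seconds
    append_records_rate: float   # kafka.raft.append_records_rate, records per second


def classify_sample(
    sample: RaftSample,
    baseline_append_rate: float,
    latency_threshold_s: float = 1.0,  # the 1 second threshold from the text
    volume_factor: float = 2.0,        # assumed cutoff for "high write volume"
) -> Optional[str]:
    """Return None when healthy, otherwise a label describing the spike."""
    breached = (
        sample.commit_latency_avg_s > latency_threshold_s
        or sample.commit_latency_max_s > latency_threshold_s
    )
    if not breached:
        return None
    # Cross-reference with the append rate to see whether write volume is elevated.
    high_volume = (
        baseline_append_rate > 0
        and sample.append_records_rate >= volume_factor * baseline_append_rate
    )
    if high_volume:
        return "latency_spike_with_high_write_volume"
    return "latency_spike_without_high_write_volume"
```

A spike that coincides with high write volume points toward the metadata change rate, while a spike at normal volume suggests looking at controller disk, CPU, or network first.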
1. Check controller disk I/O: Verify metadata log storage performance.
2. Review controller CPU: Ensure controllers have sufficient CPU capacity.
3. Check network latency: Verify inter-controller network performance.
4. Monitor quorum size: Larger quorums have higher commit latency.
5. Review metadata change rate: Excessive metadata changes can overwhelm controllers.
6. Optimize client operations: Reduce frequency of metadata-changing operations.
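For step 1, one rough way to gauge metadata log storage performance is to time small write-and-fsync cycles on the same volume. The sketch below is an assumption-laden probe, not a Kafka tool: probe_fsync_latency and its parameters are hypothetical, and it only approximates how quickly the disk persists small appends.

```python
import os
import tempfile
import time


def probe_fsync_latency(directory, writes=20, payload_size=4096):
    """Return (avg_seconds, max_seconds) for small write+fsync cycles in directory."""
    payload = b"\0" * payload_size
    timings = []
    # Create a throwaway file in the target directory; it is removed afterwards.
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync_probe_")
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before timing stops
            timings.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return sum(timings) / len(timings), max(timings)
```

Point it at a scratch directory on the same volume as the metadata log rather than at the log directory itself, and compare the results across controllers; a controller whose fsync times are much higher than its peers is a likely contributor to commit latency.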