ClickHouse, Apache ZooKeeper

Replication Lag Causing Data Inconsistency

critical
Replication
Updated Feb 6, 2026

In setups that use replicated tables, growing replication lag causes reads from lagging replicas to return stale data. It can also delay failover and, if a node fails before other replicas have fetched its data, lead to data loss.

How to detect:

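Monitor the ReplicatedChecks, ReplicatedFetch, and ReplicatedSend metrics in system.metrics. Track the replication queue size and the age of the oldest pending entry in system.replication_queue (its create_time column); system.replicas summarizes the same information per table as queue_size and absolute_delay. Alert when lag exceeds an acceptable threshold (e.g., more than 60 seconds).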
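
The detection rule above can be sketched as a small threshold check. This is a minimal illustration, not a definitive monitor: it assumes lag is read from the queue_size and absolute_delay columns of system.replicas, and the ReplicaStatus type, the lagging_replicas helper, and the queue-size threshold of 100 are hypothetical names and values chosen for the example. Only the 60-second delay threshold comes from the text.

```python
from dataclasses import dataclass

# Query to run against ClickHouse; queue_size and absolute_delay are
# columns of system.replicas (absolute_delay is in seconds).
REPLICA_LAG_QUERY = """
SELECT database, table, queue_size, absolute_delay
FROM system.replicas
"""


@dataclass(frozen=True)
class ReplicaStatus:
    """One row of REPLICA_LAG_QUERY (hypothetical helper type)."""
    database: str
    table: str
    queue_size: int
    absolute_delay: int  # seconds behind, as reported by ClickHouse


def lagging_replicas(rows, max_delay_seconds=60, max_queue_size=100):
    """Return the rows that breach either threshold.

    max_delay_seconds follows the 60-second example in the text;
    max_queue_size is an assumed value and should be tuned per cluster.
    """
    return [
        row for row in rows
        if row.absolute_delay > max_delay_seconds
        or row.queue_size > max_queue_size
    ]
```

With a client that returns rows as tuples in the selected column order, each tuple can be wrapped as ReplicaStatus(*row) and the result passed to lagging_replicas; any non-empty result would be a candidate for an alert.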

Recommended action:

Investigate network latency between the replicas and ZooKeeper. Check ZooKeeper health and connection metrics. Verify that background_schedule_pool_size is large enough for the replication tasks. Review recent schema changes or large inserts that may have overwhelmed replication. Scale ZooKeeper if its request load is high.
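
One way to check ZooKeeper health is its mntr four-letter command, which returns tab-separated key/value lines such as zk_avg_latency and zk_outstanding_requests. The sketch below assumes mntr is enabled on the server (newer ZooKeeper versions require it in 4lw.commands.whitelist). The function names and both thresholds are hypothetical and only illustrate the idea.

```python
import socket


def fetch_mntr(host, port=2181, timeout=5.0):
    """Send the 'mntr' four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Parse tab-separated 'key<TAB>value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines without a tab separator
            stats[key.strip()] = value.strip()
    return stats


def zookeeper_warnings(stats, max_avg_latency_ms=50, max_outstanding=10):
    """Return human-readable warnings; both thresholds are assumed values."""
    warnings = []
    avg_latency = float(stats.get("zk_avg_latency", 0))
    outstanding = int(stats.get("zk_outstanding_requests", 0))
    if avg_latency > max_avg_latency_ms:
        warnings.append(
            f"average latency {avg_latency} ms exceeds {max_avg_latency_ms} ms"
        )
    if outstanding > max_outstanding:
        warnings.append(
            f"{outstanding} outstanding requests exceed {max_outstanding}"
        )
    return warnings
```

Calling zookeeper_warnings(parse_mntr(fetch_mntr(host))) for each ensemble member gives a quick signal of whether ZooKeeper itself is slow or backlogged before looking at the ClickHouse side.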