ISR shrinkage progresses silently until data loss occurs
critical | Replication | Updated Dec 19, 2025 (via Exa)
Technologies:
How to detect:
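A partition's in-sync replica (ISR) set can shrink from 3 to 2 to 1 without triggering any alert, because the partition keeps serving traffic at every step. Data loss occurs only when the last remaining in-sync replica fails, so the degradation stays invisible until it is too late. A sustained ISR shrink rate above 1/sec indicates replication degradation.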
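To make the "silent" part concrete, here is a minimal sketch of one partition's state as its ISR shrinks. The `PartitionState` class and its property names are hypothetical and exist only for this illustration; they are not part of any client library or broker API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionState:
    """Hypothetical model of a single partition, for illustration only."""
    replicas: int  # configured replication factor
    isr: int       # number of replicas currently in sync

    @property
    def under_replicated(self) -> bool:
        # True as soon as any replica has fallen out of the ISR.
        return self.isr < self.replicas

    @property
    def failures_tolerated(self) -> int:
        # How many more in-sync replicas can fail while one copy remains.
        return max(self.isr - 1, 0)

    @property
    def has_in_sync_copy(self) -> bool:
        # False only once the final in-sync replica is gone.
        return self.isr >= 1


# Walk the ISR down from 3 to 0 with a replication factor of 3.
timeline = [PartitionState(replicas=3, isr=n) for n in (3, 2, 1, 0)]

for state in timeline:
    print(
        f"isr={state.isr} "
        f"under_replicated={state.under_replicated} "
        f"failures_tolerated={state.failures_tolerated} "
        f"has_in_sync_copy={state.has_in_sync_copy}"
    )
```

In this model, a check that only looks at `has_in_sync_copy` stays green at ISR sizes 3, 2, and 1, while `under_replicated` already turns true at the first shrink. That gap is why the under-replication signal is the one to alert on.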
Recommended action:
Alert on ISR shrinkage and under-replicated partitions as soon as they persist, rather than waiting for a replica failure to expose them. Monitor both UnderReplicatedPartitions (alert when >0 for 5 min) and ISR Shrink Rate (alert when >1/sec sustained).
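The two thresholds above can be sketched as "value above threshold for a hold period" rules. This is a minimal illustration, not a definitive implementation. The `Rule` class and `fires` function are hypothetical names. Samples are assumed to arrive as `(timestamp_seconds, value)` pairs from whatever metrics system is in use. The 60-second hold for the shrink rate is an assumption, since the text only says "sustained".

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, metric value)


@dataclass(frozen=True)
class Rule:
    """Hypothetical alert rule: fire when value > threshold for hold_s seconds."""
    name: str
    threshold: float
    hold_s: float


# Thresholds from the text: >0 for 5 min, and >1/sec sustained.
URP_RULE = Rule("UnderReplicatedPartitions", threshold=0, hold_s=300)
# Assumption: "sustained" is interpreted here as 60 seconds.
SHRINK_RULE = Rule("ISR Shrink Rate", threshold=1.0, hold_s=60)


def fires(rule: Rule, samples: Iterable[Sample]) -> bool:
    """Return True if the metric stays above the threshold for the hold period.

    Samples must be ordered by timestamp. Any sample at or below the
    threshold resets the breach window.
    """
    breach_start: Optional[float] = None
    for ts, value in samples:
        if value > rule.threshold:
            if breach_start is None:
                breach_start = ts
            if ts - breach_start >= rule.hold_s:
                return True
        else:
            breach_start = None
    return False


# Usage: two partitions under-replicated from t=60s onward, sampled every 60s.
urp_samples = [(0, 0), (60, 2), (120, 2), (180, 2), (240, 2), (300, 2), (360, 2)]
if fires(URP_RULE, urp_samples):
    print(f"ALERT: {URP_RULE.name} > {URP_RULE.threshold} for {URP_RULE.hold_s}s")
```

The hold period keeps a brief blip, such as a single broker restart, from paging anyone, while a persistent breach still fires well before the last in-sync replica is at risk.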