Confluent Platform · Apache Kafka

ISR shrinkage progresses silently until data loss occurs

critical
Replication · Updated Dec 19, 2025 (via Exa)
How to detect:

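A partition's in-sync replica (ISR) set can shrink from 3 to 2 to 1 without triggering any alert unless you monitor for it explicitly. Data loss occurs only when the final in-sync replica fails, so every earlier step passes unnoticed. An ISR shrink rate that stays above 1/sec, rather than spiking briefly, indicates replication degradation.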
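The 3 to 2 to 1 progression above can be made visible by comparing each partition's ISR size with its assigned replica count. The following is a minimal sketch, not a Kafka API: `PartitionState`, `classify`, and the status labels are hypothetical names, and it assumes the replica and ISR broker lists have already been collected by some other means.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PartitionState:
    """Hypothetical snapshot of one partition (not a Kafka client type)."""
    topic: str
    partition: int
    replicas: Tuple[int, ...]  # broker ids assigned to the partition
    isr: Tuple[int, ...]       # broker ids currently in sync


def failures_tolerated(p: PartitionState) -> int:
    """How many more in-sync replicas can fail before none is left."""
    return max(len(p.isr) - 1, 0)


def classify(p: PartitionState) -> str:
    """Label a partition by how far its ISR has shrunk."""
    if len(p.isr) == 0:
        return "offline"           # no in-sync replica left
    if len(p.isr) == 1 and len(p.replicas) > 1:
        return "at-risk"           # one more failure means data loss
    if len(p.isr) < len(p.replicas):
        return "under-replicated"  # shrunk, but still redundant
    return "healthy"


if __name__ == "__main__":
    # The silent progression described above: 3 -> 2 -> 1 in-sync replicas.
    for isr in [(1, 2, 3), (1, 2), (1,)]:
        state = PartitionState("orders", 0, (1, 2, 3), isr)
        print(len(isr), classify(state), failures_tolerated(state))
```

The "at-risk" label is the state worth paging on, since it is the last point at which acting still prevents data loss.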

Recommended action:

Set up alerting on ISR shrinkage and under-replicated partitions now, before the ISR reaches a single replica. Monitor both UnderReplicatedPartitions (alert when the value stays >0 for 5 min) and ISR Shrink Rate (alert when it stays >1/sec).
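The two thresholds above can be sketched as a small rule evaluator over metric samples. This is an illustration under stated assumptions, not a monitoring product's API: `Sample`, `breach_duration`, and `evaluate_alerts` are hypothetical names, the samples are assumed to have been scraped already (for example from the brokers' JMX metrics), and the 60-second window for "sustained" shrink rate is an assumed value, since the text gives no duration for it.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Sample:
    """Hypothetical metric sample taken from one broker at time ts (seconds)."""
    ts: float
    under_replicated: int        # UnderReplicatedPartitions gauge value
    isr_shrinks_per_sec: float   # ISR shrink rate


def breach_duration(samples: Sequence[Sample],
                    predicate: Callable[[Sample], bool]) -> float:
    """Seconds the predicate has held continuously up to the latest sample."""
    ordered = sorted(samples, key=lambda s: s.ts)
    if not ordered or not predicate(ordered[-1]):
        return 0.0
    start = ordered[-1].ts
    for s in reversed(ordered):
        if not predicate(s):
            break  # the streak ended here; earlier breaches do not count
        start = s.ts
    return ordered[-1].ts - start


def evaluate_alerts(samples: Sequence[Sample],
                    urp_window_s: float = 300.0,      # 5 min, from the text
                    shrink_window_s: float = 60.0     # assumed window
                    ) -> List[str]:
    """Return the names of the rules that are currently firing."""
    alerts = []
    if breach_duration(samples, lambda s: s.under_replicated > 0) >= urp_window_s:
        alerts.append("under_replicated_partitions")
    if breach_duration(samples, lambda s: s.isr_shrinks_per_sec > 1.0) >= shrink_window_s:
        alerts.append("isr_shrink_rate")
    return alerts


if __name__ == "__main__":
    # One sample every 30 s over 5 minutes, with one partition under-replicated.
    degraded = [Sample(float(t), 1, 0.0) for t in range(0, 301, 30)]
    print(evaluate_alerts(degraded))
```

Requiring an unbroken streak means a single healthy sample resets the timer. That reduces noise from brief shrinks during rolling restarts, at the cost of delaying the alert for a partition that flaps in and out of sync.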