MinIO

Replication Lag Accumulation

warning
Replication

Updated Jul 24, 2024

Bucket or site replication can fall behind because of network issues, an unavailable target, or too few replication workers. The resulting backlog keeps growing until it puts the recovery point objective (RPO) at risk.

How to detect:

Watch for any of the following: rising replication worker queue depth; objects_pending_replication growing over time; replication_active_workers pinned at its maximum while the backlog keeps increasing; or replication_throughput_bytes dropping below baseline under normal load.
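The signals above can be sketched as a small check over a window of metric samples. This is an illustrative sketch, not a MinIO API: the metric names come from the text, while the Sample container, the detect_lag function, the max_workers and baseline_throughput inputs, and the drop_ratio threshold are all hypothetical and need to be wired to whatever monitoring system scrapes the cluster.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Sample:
    """One scrape of the replication metrics (hypothetical container)."""
    pending: int             # objects_pending_replication
    active_workers: int      # replication_active_workers
    throughput_bytes: float  # replication_throughput_bytes


def detect_lag(samples: Sequence[Sample], max_workers: int,
               baseline_throughput: float, drop_ratio: float = 0.5) -> List[str]:
    """Return the lag signals that fire for a window ordered oldest to newest."""
    if len(samples) < 2:
        return []  # at least two points are needed to see a trend

    signals = []
    pendings = [s.pending for s in samples]

    # Backlog growing: never shrinks across the window and ends higher than it began.
    growing = (all(b >= a for a, b in zip(pendings, pendings[1:]))
               and pendings[-1] > pendings[0])
    if growing:
        signals.append("backlog_growing")

    # Workers pinned at the maximum for the whole window while the backlog grows.
    if growing and all(s.active_workers >= max_workers for s in samples):
        signals.append("workers_saturated")

    # Latest throughput well below baseline (drop_ratio is an assumed threshold).
    if samples[-1].throughput_bytes < baseline_throughput * drop_ratio:
        signals.append("throughput_below_baseline")

    return signals
```

For example, a window in which the pending count climbs from 100 to 220 with all 8 of 8 workers busy and throughput falling to 40% of baseline would fire all three signals, while a window with a shrinking backlog fires none.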

Recommended action:

Use 'mc admin replicate resync status' to inspect replication backlog details. Verify target cluster health and network connectivity with hperf. Check whether the replication workers are saturated, that is, CPU-bound or network-bound. Use 'mc admin replicate resync start' to trigger replication recovery manually. Alert when the backlog exceeds the RPO tolerance (for example, more than 1 hour of writes).
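One way to turn the "more than 1 hour of writes" rule into an alert condition is to express the backlog as the amount of write time it represents. The sketch below assumes the pending backlog size in bytes and a recent average write rate are already available from monitoring. The function names, parameters, and the choice to treat a zero write rate with a non-empty backlog as a breach are all assumptions made for illustration.

```python
def backlog_age_seconds(backlog_bytes: float, write_rate_bytes_per_sec: float) -> float:
    """Express the backlog as seconds of writes at the recent average write rate."""
    if backlog_bytes <= 0:
        return 0.0
    if write_rate_bytes_per_sec <= 0:
        # A backlog with no measurable write rate cannot be sized in write time;
        # treating it as unbounded is a conservative assumption.
        return float("inf")
    return backlog_bytes / write_rate_bytes_per_sec


def exceeds_rpo(backlog_bytes: float, write_rate_bytes_per_sec: float,
                rpo_seconds: float = 3600.0) -> bool:
    """True when the backlog represents more write time than the RPO allows."""
    return backlog_age_seconds(backlog_bytes, write_rate_bytes_per_sec) > rpo_seconds
```

With a 1-hour RPO and an average write rate of 100 MB/s, a 400 GB backlog corresponds to 4000 seconds of writes and would trigger the alert, while a 200 GB backlog (2000 seconds) would not.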