Apache Kafka

Offline Partitions Indicate Total Partition Unavailability

critical
availability
Updated Mar 2, 2026

When a partition goes offline (all of its replicas are unavailable), it cannot serve produce or consume requests, so the data in that partition is completely unavailable until a replica returns.

How to detect:

Alert when kafka.replication.offline_partitions_count > 0, or when kafka.partition.offline = 1 for any partition. Either condition means a partition has no available leader and requires immediate attention.
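As a rough illustration of the rule above, the following Python sketch evaluates both conditions against metric samples that have already been collected. The metric names come from this page. The function name `offline_alerts` and the input shapes (a single count plus a mapping keyed by topic and partition) are hypothetical, chosen only for the example, and do not correspond to any particular monitoring API.

```python
from typing import List, Mapping, Tuple

# Metric names are taken from the detection rule on this page.
OFFLINE_COUNT_METRIC = "kafka.replication.offline_partitions_count"
PARTITION_OFFLINE_METRIC = "kafka.partition.offline"


def offline_alerts(
    offline_count: float,
    partition_offline: Mapping[Tuple[str, int], float],
) -> List[str]:
    """Return one message per detection condition that fires.

    offline_count: latest value of the cluster-wide offline partition count.
    partition_offline: latest per-partition value, keyed by (topic, partition).
    Both input shapes are assumptions made for this sketch.
    """
    alerts = []
    if offline_count > 0:
        alerts.append(f"{OFFLINE_COUNT_METRIC}={offline_count:g} (expected 0)")
    # Sort so the messages come out in a stable topic/partition order.
    for (topic, partition), value in sorted(partition_offline.items()):
        if value == 1:
            alerts.append(f"{PARTITION_OFFLINE_METRIC}=1 for {topic}-{partition}")
    return alerts
```

An empty result means neither condition fired. Any non-empty result should be treated as critical, since each message corresponds to data that currently cannot be produced to or consumed from.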

Recommended action:

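1. Identify affected topics: Determine which topics have offline partitions.
2. Check all replicas: Verify the status of every replica of each offline partition.
3. Restore replicas: Bring the broker(s) hosting those replicas back online.
4. Weigh availability against data loss: With unclean.leader.election.enable set to false, the partition stays offline until a replica from the in-sync replica set (ISR) returns. Enabling the setting temporarily lets an out-of-sync replica become leader, which restores availability but discards any messages that replica had not yet replicated. Treat this as a last resort and disable it again afterward.
5. Review logs: Check broker logs for the errors that caused the replica failures.
6. Verify disk health: Ensure disk/storage is functional on the affected brokers.
7. Consider increasing the replication factor: Add more replicas so that fewer failure scenarios can take every replica of a partition offline.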
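Steps 1 to 3 can be partly automated. The Python sketch below parses the text printed by `kafka-topics.sh --describe --unavailable-partitions` and lists, per topic, the offline partitions and the broker IDs that host their replicas. It assumes the row layout `Topic: ... Partition: ... Leader: ... Replicas: ...` and that a missing leader is printed as `none` or `-1`. Output details vary between Kafka versions, so verify the pattern against your own cluster. The function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Matches partition rows of `kafka-topics.sh --describe` output.
# Only the fields needed here are captured; the exact layout is an
# assumption and may differ between Kafka versions.
ROW = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>[\d,]*)"
)

# Values assumed to denote "no leader" in the Leader column.
NO_LEADER = {"none", "-1"}


def offline_partitions(describe_output: str) -> Dict[str, Dict[int, List[int]]]:
    """Map topic -> partition -> replica broker IDs for leaderless partitions."""
    result: Dict[str, Dict[int, List[int]]] = defaultdict(dict)
    for line in describe_output.splitlines():
        match = ROW.search(line)
        if match is None:
            continue  # topic summary rows and blank lines do not match
        if match.group("leader") not in NO_LEADER:
            continue  # partition has a leader, so it is not offline
        replicas = [int(b) for b in match.group("replicas").split(",") if b]
        result[match.group("topic")][int(match.group("partition"))] = replicas
    return dict(result)


def brokers_to_restore(offline: Dict[str, Dict[int, List[int]]]) -> Set[int]:
    """Collect every broker ID that hosts a replica of an offline partition."""
    return {
        broker
        for partitions in offline.values()
        for replicas in partitions.values()
        for broker in replicas
    }
```

The keys of the first result answer step 1, the replica lists answer step 2, and `brokers_to_restore` gives the set of brokers to check and bring back online in step 3.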