Consumer Fetch Latency Spike from Broker Overload
Severity: warning
When consumer fetch latency increases significantly, it indicates the broker is slow to respond to fetch requests, often due to disk I/O, CPU saturation, or competing produce traffic.
Monitor kafka.consumer.fetch_latency_avg or kafka.request.fetch_consumer_time_avg for values exceeding 2x baseline. Cross-reference with kafka.request.handler_idle_percent and kafka.consumer.fetch_rate to determine whether the cause is broker overload or a change in consumer behavior.
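The cross-referencing rule above can be sketched as a small classifier. This is a minimal illustration, not a definitive alerting rule: the FetchSnapshot type, the classify function, and the idle_floor and rate_tolerance thresholds are hypothetical, and the sketch assumes the handler idle metric is reported as a fraction between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class FetchSnapshot:
    """Hypothetical container for one reading of the metrics named above."""
    fetch_latency_ms: float      # e.g. kafka.consumer.fetch_latency_avg
    baseline_latency_ms: float   # normal value for the same metric
    handler_idle: float          # kafka.request.handler_idle_percent, assumed 0..1
    fetch_rate: float            # kafka.consumer.fetch_rate
    baseline_fetch_rate: float   # normal value for the same metric


def classify(s: FetchSnapshot,
             idle_floor: float = 0.3,
             rate_tolerance: float = 0.5) -> str:
    """Return a coarse diagnosis; both thresholds are illustrative assumptions."""
    # No alert unless latency exceeds 2x baseline, as in the rule above.
    if s.fetch_latency_ms <= 2 * s.baseline_latency_ms:
        return "ok"
    # Request handlers with little idle time point at an overloaded broker.
    if s.handler_idle < idle_floor:
        return "broker_overload"
    # Otherwise, a large relative shift in fetch rate suggests the consumers changed.
    if s.baseline_fetch_rate > 0:
        drift = abs(s.fetch_rate - s.baseline_fetch_rate) / s.baseline_fetch_rate
        if drift > rate_tolerance:
            return "consumer_behavior_change"
    return "inconclusive"
```

For example, a reading of 40 ms against a 10 ms baseline with handlers 10% idle would be classified as broker overload, while the same latency with handlers 80% idle and a tripled fetch rate would point at consumer behavior.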
1. Check broker load: Monitor CPU, disk I/O, and network on the broker.
2. Review competing traffic: Check if produce traffic is interfering with fetch requests.
3. Increase broker capacity: Scale vertically or horizontally if overloaded.
4. Tune fetch.min.bytes: Adjust to allow batching more data per fetch.
5. Check disk performance: Slow disk reads increase fetch latency.
6. Monitor partition leadership: Unbalanced leadership can overload specific brokers.
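The partition leadership check in step 6 can be approximated by counting leaders per broker. The sketch below is one possible approach under stated assumptions: the overloaded_leaders function, the shape of its input (a mapping from (topic, partition) to leader broker id, which would have to be collected from cluster metadata by other means), and the 1.5x skew factor are all hypothetical.

```python
from collections import Counter


def overloaded_leaders(partition_leaders, broker_ids, skew_factor=1.5):
    """Return broker ids leading more than skew_factor times the mean share.

    partition_leaders: dict mapping (topic, partition) -> leader broker id.
    broker_ids: every broker in the cluster, so brokers that lead
                nothing still count toward the mean.
    skew_factor: illustrative threshold, not a Kafka default.
    """
    if not broker_ids:
        return []
    counts = Counter(partition_leaders.values())
    mean = len(partition_leaders) / len(broker_ids)
    return sorted(b for b in broker_ids if counts.get(b, 0) > skew_factor * mean)


# Hypothetical cluster: broker 1 leads four of six partitions.
leaders = {
    ("orders", 0): 1, ("orders", 1): 1, ("orders", 2): 1,
    ("orders", 3): 1, ("orders", 4): 2, ("orders", 5): 3,
}
skewed = overloaded_leaders(leaders, broker_ids=[1, 2, 3])
```

A broker flagged here is a candidate for leadership rebalancing before adding capacity, since it serves a disproportionate share of fetch requests.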