Kafka Consumer Group Rebalance Storm Triggering Lambda Restarts
Warning: Frequent Kafka consumer group rebalances (detected via kafka_consumergroup_members changes) can trigger Lambda function restarts (fullRestarts metric). The result is processing interruptions, more cold starts (visible as higher InitDuration), and temporary offset lag spikes while Lambda event source mappings rejoin the consumer group.
Track spikes in the Lambda fullRestarts and downtime metrics that coincide with kafka_consumergroup_members changes or increases in kafka.consumer.delayed_requests. Watch for InitDuration spikes following fullRestarts, which indicate cold start penalties after a rebalance.
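The correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive detector: it assumes the two metrics have already been exported as lists of (unix timestamp, value) samples, and the function names and the 300-second window are hypothetical choices made for the example.

```python
from typing import List, Tuple

# Assumption: each sample is (unix timestamp in seconds, metric value).
Sample = Tuple[int, float]


def change_times(samples: List[Sample]) -> List[int]:
    """Timestamps at which the value differs from the previous sample."""
    return [t for (_, prev), (t, cur) in zip(samples, samples[1:]) if cur != prev]


def increase_times(samples: List[Sample]) -> List[int]:
    """Timestamps at which the value is higher than in the previous sample."""
    return [t for (_, prev), (t, cur) in zip(samples, samples[1:]) if cur > prev]


def correlated_restarts(
    members: List[Sample],
    restarts: List[Sample],
    window_s: int = 300,  # hypothetical correlation window
) -> List[int]:
    """Restart increases that occur within window_s seconds after a
    change in consumer group membership."""
    rebalances = change_times(members)
    return [
        t
        for t in increase_times(restarts)
        if any(0 <= t - r <= window_s for r in rebalances)
    ]


# Hypothetical samples of kafka_consumergroup_members and fullRestarts.
members = [(0, 3), (60, 3), (120, 2), (180, 3), (240, 3)]
restarts = [(0, 0), (60, 0), (120, 0), (180, 1), (240, 1), (1000, 2)]
flagged = correlated_restarts(members, restarts)
```

With these samples, only the restart at t=180 is flagged, because it follows the membership change at t=120 within the window; the restart at t=1000 is not.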
Increase the consumer group's session.timeout.ms to reduce rebalance sensitivity, and keep heartbeat.interval.ms at no more than one third of that value so that several heartbeats fit inside each session window. Use Lambda provisioned concurrency to minimize cold start impact after rebalances. Review Kafka broker logs to identify what triggers the rebalances. If using Lambda provisioned mode, keep the MinimumPollers configuration stable to reduce poller churn.
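When tuning the two timeouts, a quick sanity check can catch a mismatched pair before it is deployed. The sketch below assumes the consumer settings are available as a plain dictionary keyed by the Kafka property names, and that the consumer configuration is under your control. The check_timeouts helper is hypothetical; the one-third ratio follows the guidance in the Kafka consumer configuration documentation.

```python
from typing import Dict, List


def check_timeouts(config: Dict[str, int]) -> List[str]:
    """Return a list of problems found in the timeout settings.

    Assumption: config holds session.timeout.ms and heartbeat.interval.ms
    as integer millisecond values.
    """
    session = int(config["session.timeout.ms"])
    heartbeat = int(config["heartbeat.interval.ms"])
    problems = []
    # Heartbeats should fit at least three times into the session timeout,
    # otherwise one delayed heartbeat can expire the session.
    if heartbeat * 3 > session:
        problems.append(
            f"heartbeat.interval.ms={heartbeat} exceeds one third of "
            f"session.timeout.ms={session}"
        )
    return problems


# Illustrative values only, not a recommendation for any workload.
ok = check_timeouts({"session.timeout.ms": 45000, "heartbeat.interval.ms": 3000})
bad = check_timeouts({"session.timeout.ms": 45000, "heartbeat.interval.ms": 20000})
```

Here ok is an empty list, while bad contains one message describing the violated ratio.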