Pod CPU Throttling Despite Available Node Capacity
Critical pods hit their CPU limits and are throttled even when the underlying node has spare CPU capacity, indicating pod-level saturation rather than node exhaustion. This leads to serialized request processing, inflated tail latency, and degraded throughput.
Watch for pod CPU usage approaching the pod's limit (>95% of limit) while node CPU utilization remains moderate (<60%). Look for rising p95/p99 latency, request-queue buildup, and throughput that plateaus despite spare node capacity. Check CFS throttling metrics (throttled periods versus total scheduling periods) and, for single-threaded runtimes, event-loop blocking patterns.
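The throttling check above can be sketched as a ratio of throttled CFS periods to total periods, the same counters cgroups expose in cpu.stat and cAdvisor exports as container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total. The threshold and sample counter deltas below are illustrative assumptions, not values from a real pod:

```python
def throttle_ratio(nr_periods: int, nr_throttled: int) -> float:
    """Fraction of CFS scheduling periods in which the pod was throttled."""
    if nr_periods == 0:
        return 0.0
    return nr_throttled / nr_periods

# Hypothetical counter deltas over a 5-minute window:
periods, throttled = 3000, 900
ratio = throttle_ratio(periods, throttled)
print(f"throttled in {ratio:.0%} of periods")  # 30%
if ratio > 0.25:  # example alerting threshold, tune per workload
    print("sustained throttling: raise the CPU limit or add replicas")
```

A sustained nonzero ratio while node utilization stays moderate is the signature of this failure mode: the limit, not the node, is the bottleneck.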
Raise the pod's CPU limit vertically (e.g., from 3 to 4-6 cores) or scale replicas horizontally to spread the load. For event-loop applications, move blocking operations off the main loop into worker pools. Monitor thread counts and profile CPU-intensive code paths for optimization.
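The worker-pool refactor can be sketched with asyncio: CPU-bound work is handed to a process pool via run_in_executor so the event loop keeps accepting requests instead of processing them serially. The cpu_heavy function and the pool size are hypothetical stand-ins for the application's actual hot path:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n: int) -> int:
    # Placeholder for blocking, CPU-intensive work (hashing, parsing, etc.).
    return sum(i * i for i in range(n))

async def handle_request(pool: ProcessPoolExecutor, n: int) -> int:
    loop = asyncio.get_running_loop()
    # run_in_executor keeps the event loop free while the pool does the work.
    return await loop.run_in_executor(pool, cpu_heavy, n)

async def main() -> None:
    with ProcessPoolExecutor(max_workers=4) as pool:
        # Eight concurrent "requests" that would otherwise block serially.
        results = await asyncio.gather(
            *(handle_request(pool, 10_000) for _ in range(8))
        )
        print(len(results), "requests completed")

if __name__ == "__main__":
    asyncio.run(main())
```

A process pool is used here because CPU-bound work in CPython does not parallelize across threads; for I/O-bound blocking calls a ThreadPoolExecutor would be the lighter choice.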