Kubernetes / FastAPI

Pod CPU Throttling Despite Available Node Capacity

Severity: critical
Category: Resource Contention
Updated: Jan 10, 2026

Pods hit their CPU limits and are throttled even though the underlying node has spare CPU capacity, indicating application-level saturation rather than node exhaustion. The result is serialized request processing, inflated tail latency, and degraded throughput.

How to detect:

Monitor pod CPU usage approaching pod limits (>95%) while node CPU utilization remains moderate (<60%). Watch for increasing p95/p99 latency, request queue buildup, and throughput plateauing despite available node capacity. Check for CPU throttling metrics and event loop blocking patterns.
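If Prometheus scrapes cAdvisor metrics, the throttling described above can be quantified with the CFS counters below; this is a sketch, and the pod label selector is a placeholder to be replaced with your own workload's labels:

```promql
# Fraction of CFS scheduling periods in which the container was throttled.
# A sustained value well above 0 while node CPU is moderate points at the
# pod limit, not the node, as the bottleneck.
rate(container_cpu_cfs_throttled_periods_total{pod=~"my-api-.*"}[5m])
  / rate(container_cpu_cfs_periods_total{pod=~"my-api-.*"}[5m])
```

Pairing this ratio with node-level CPU utilization on the same dashboard makes the "throttled pod on an idle node" pattern easy to spot.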

Recommended action:

Increase pod CPU limits vertically (e.g., from 3 to 4-6 cores) or scale replicas horizontally to distribute load. For event-loop applications, refactor blocking operations to non-blocking patterns using worker pools. Monitor thread counts and identify CPU-intensive code paths for optimization.
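For an asyncio-based service such as FastAPI, the worker-pool refactor mentioned above can be sketched as follows. The function and pool names are illustrative; a thread pool is shown for simplicity, while truly CPU-bound pure-Python work would use a ProcessPoolExecutor to sidestep the GIL:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Illustrative worker pool; size it to the pod's CPU limit.
# For CPU-bound pure-Python code paths, prefer ProcessPoolExecutor.
pool = ThreadPoolExecutor(max_workers=4)

def cpu_heavy(n: int) -> int:
    # Placeholder for a blocking, CPU-intensive code path.
    return sum(i * i for i in range(n))

async def handler(n: int) -> int:
    # Offload the blocking call to the pool so the event loop
    # keeps serving other requests instead of processing serially.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, cpu_heavy, n)
```

Inside a FastAPI route, awaiting `handler(...)` keeps the event loop responsive while the pool absorbs the blocking work, which is what restores concurrent request processing.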