Load Balancer Queue Buildup from Backend Concurrency Limits
Severity: critical

When backend FastAPI workers experience event loop blocking, the NGINX ingress accumulates requests in upstream queues because backends appear busy despite having spare CPU capacity. The result is rising queue wait times and, eventually, gateway timeouts for clients, creating a cascading latency problem.
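The mechanism can be demonstrated with plain asyncio, without FastAPI: a handler that makes a synchronous blocking call (here simulated with time.sleep) stalls the whole event loop, so concurrent requests serialize, while a properly awaited handler overlaps them. The handler names and timings below are illustrative, not from this runbook.

```python
import asyncio
import time

async def blocking_handler():
    # Synchronous sleep blocks the event loop: no other coroutine can run
    # until it returns. A sync DB driver or requests call behaves the same way.
    time.sleep(0.2)

async def async_handler():
    # Awaiting yields control, so other handlers run while this one waits.
    await asyncio.sleep(0.2)

async def measure(handler, n=5):
    # Time n "requests" issued concurrently against the same worker.
    start = time.monotonic()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.monotonic() - start

async def main():
    serialized = await measure(blocking_handler)  # roughly n * 0.2s
    concurrent = await measure(async_handler)     # roughly 0.2s total
    print(f"blocking: {serialized:.2f}s, async: {concurrent:.2f}s")

if __name__ == "__main__":
    asyncio.run(main())
```

The blocking variant finishes in about n times the single-request latency even though the CPU is idle the whole time, which is exactly the "busy backend, moderate CPU" signature described above.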
Monitor NGINX upstream request queue times and correlate with backend response times. When queue times increase while backend CPU remains moderate and database/cache metrics are healthy, the issue is likely backend concurrency saturation from event loop blocking. Watch for the pattern where effective RPS per pod drops before CPU or memory limits are reached.
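The correlation rule above can be sketched as a small alerting predicate. The function name, metric names, and thresholds are hypothetical placeholders to be replaced with values tuned to your traffic, not part of this runbook:

```python
def concurrency_saturation_suspected(
    queue_wait_p95_ms: float,
    backend_cpu_pct: float,
    queue_wait_threshold_ms: float = 250.0,  # assumed alerting threshold
    cpu_moderate_pct: float = 70.0,          # assumed "moderate CPU" ceiling
) -> bool:
    """Flag likely event-loop-driven concurrency saturation.

    Fires when upstream queue waits are elevated while backend CPU stays
    moderate, i.e. the backend is stalled rather than resource-bound.
    """
    return (
        queue_wait_p95_ms > queue_wait_threshold_ms
        and backend_cpu_pct < cpu_moderate_pct
    )
```

A high-queue-wait, high-CPU combination deliberately does not fire here, since that points at ordinary resource saturation and calls for scaling rather than a blocking-code investigation.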
Instrument both NGINX and backend applications to correlate queue times with backend event loop behavior. Set alerts on rising upstream queue times that don't correlate with backend resource saturation. Investigate backend async endpoints for blocking operations that limit actual concurrency. Consider temporarily increasing NGINX upstream timeout thresholds while backend blocking issues are resolved, but treat this as a stopgap, not a solution.
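A common fix for the blocking operations mentioned above is to move synchronous work off the event loop onto a worker thread. A minimal sketch using the standard library's asyncio.to_thread (Python 3.9+); blocking_io is a stand-in for whatever sync call the endpoint makes:

```python
import asyncio
import time

def blocking_io():
    # Stand-in for a synchronous DB driver or HTTP client call.
    time.sleep(0.2)
    return "done"

async def handler():
    # Calling blocking_io() directly here would stall the event loop.
    # to_thread() runs it in the default thread pool so the loop stays free.
    return await asyncio.to_thread(blocking_io)

async def main():
    start = time.monotonic()
    results = await asyncio.gather(*(handler() for _ in range(5)))
    return results, time.monotonic() - start

if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    # Five 0.2s calls overlap in threads instead of serializing on the loop.
    print(results, f"{elapsed:.2f}s")
```

Note that the thread pool itself becomes the concurrency limit for offloaded work, so this restores event loop responsiveness rather than raising total throughput; heavily blocking endpoints may still need more workers.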