FastAPIEnvoy Proxy

Request Queue Buildup Under Burst Traffic

critical
scalingUpdated Jan 10, 2026

Request queue times increase at load balancer during traffic bursts despite moderate server resource utilization, indicating insufficient concurrency handling or event loop saturation.

How to detect:

Monitor load balancer queue depth and queueing time metrics. Alert when queue time exceeds 100ms or queue depth grows beyond 2x worker count. Correlate with http.server.active_requests to identify saturation point.

Recommended action:

Increase Uvicorn worker count to match CPU cores. Configure connection limits and backpressure mechanisms. Implement autoscaling based on active request count rather than CPU. Add request shedding for non-critical traffic during overload.