Request Queue Buildup Indicates Worker Exhaustion
critical
When NGINX reaches its connection capacity (worker_processes × worker_connections, the counterpart of MaxRequestWorkers in Apache httpd's event MPM), new requests queue at the load balancer level, causing gateway timeouts even when CPU and dependencies appear healthy. This typically shows up as moderate CPU (~50-60%) alongside rising tail latency and 502/504 errors.
Monitor nginx_server_zone_processing for sustained increases alongside rising nginx_upstream_peers_response_time. Watch for increases in nginx_server_zone_responses_5xx, particularly 502 and 504 responses. Check whether worker processes are at capacity while CPU remains below 70%; that combination points to worker exhaustion rather than a compute bottleneck.
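The detection rule above can be sketched as a small check over a window of metric samples. This is a minimal illustration, not a production alert: the Sample type, the half-window comparison, the 1.5× rise ratio, and the function names are all assumptions introduced here; only the metric names and the 70% CPU ceiling come from the text.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class Sample:
    """One scrape of the metrics named in the text (field names are hypothetical)."""
    processing: float      # nginx_server_zone_processing
    upstream_rt_ms: float  # nginx_upstream_peers_response_time
    responses_5xx: float   # nginx_server_zone_responses_5xx (cumulative counter)
    cpu_pct: float         # host CPU utilisation, 0-100


def rising(values: Sequence[float], min_ratio: float = 1.5) -> bool:
    """Assumed heuristic: the later half of the window averages at least
    min_ratio times the earlier half."""
    half = len(values) // 2
    if half == 0:
        return False
    early = mean(values[:half])
    late = mean(values[-half:])
    return late > 0 and late >= min_ratio * max(early, 1e-9)


def suspect_worker_exhaustion(samples: Sequence[Sample],
                              cpu_ceiling: float = 70.0) -> bool:
    """True when in-flight requests and upstream latency both rise, 5xx
    responses grow, and CPU stays under the ceiling for the whole window."""
    if len(samples) < 2:
        return False
    queue_rising = rising([s.processing for s in samples])
    latency_rising = rising([s.upstream_rt_ms for s in samples])
    errors_growing = samples[-1].responses_5xx > samples[0].responses_5xx
    cpu_has_headroom = max(s.cpu_pct for s in samples) < cpu_ceiling
    return queue_rising and latency_rising and errors_growing and cpu_has_headroom
```

A window where processing climbs from about 10 to 40 while CPU stays near 55-60% would be flagged; the same window at 95% CPU would not, since that looks like a compute bottleneck instead.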
Raise worker capacity. On NGINX, increase worker_connections (and worker_rlimit_nofile to match). On Apache httpd, increase MaxRequestWorkers and ServerLimit together (ServerLimit must be raised first); for the event MPM, MaxRequestWorkers can be at most ServerLimit × ThreadsPerChild. Ensure OS limits (ulimit -n, net.core.somaxconn, net.ipv4.ip_local_port_range) support the increased connection count. Before scaling workers, verify that no upstream bottlenecks (blocking I/O, database locks) are causing artificial worker saturation.
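The event MPM sizing arithmetic can be worked through with a short helper. This is a sketch under stated assumptions: plan_event_mpm and MpmPlan are hypothetical names, and the figure of two file descriptors per connection (one client socket plus one upstream socket when proxying) is a rough assumption that ignores log files, listeners, and idle keep-alive connections.

```python
import math
from dataclasses import dataclass


@dataclass
class MpmPlan:
    server_limit: int            # value to set for ServerLimit
    threads_per_child: int       # ThreadsPerChild, unchanged
    max_request_workers: int     # ServerLimit x ThreadsPerChild
    fds_needed_per_process: int  # rough per-process descriptor estimate
    fits_nofile_limit: bool      # estimate <= ulimit -n


def plan_event_mpm(target_workers: int,
                   threads_per_child: int,
                   nofile_limit: int,
                   fds_per_connection: int = 2) -> MpmPlan:
    """Round the desired worker count up to a whole number of child
    processes and compare a rough descriptor estimate with ulimit -n."""
    if target_workers <= 0 or threads_per_child <= 0:
        raise ValueError("target_workers and threads_per_child must be positive")
    # ServerLimit must be large enough to cover the target worker count.
    server_limit = math.ceil(target_workers / threads_per_child)
    max_request_workers = server_limit * threads_per_child
    # ulimit -n applies per process; assume one connection per thread.
    fds_needed = threads_per_child * fds_per_connection
    return MpmPlan(
        server_limit=server_limit,
        threads_per_child=threads_per_child,
        max_request_workers=max_request_workers,
        fds_needed_per_process=fds_needed,
        fits_nofile_limit=fds_needed <= nofile_limit,
    )


if __name__ == "__main__":
    print(plan_event_mpm(target_workers=800, threads_per_child=25, nofile_limit=1024))
```

Computing ServerLimit first mirrors the ordering constraint in the text: the process ceiling has to be raised before MaxRequestWorkers can grow past it.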