Load Balancing Algorithm Selection and Upstream Health

Proactive Health

Choosing the right load balancing algorithm (round-robin, least_conn, ip_hash) and configuring health checks for optimal traffic distribution.

Prompt: My NGINX load balancer is using round-robin but some backend servers are getting overloaded while others are idle. How do I choose between least_conn, ip_hash, and weighted round-robin based on my application characteristics and connection patterns?

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When round-robin causes uneven load despite equal connection distribution, the root cause is usually long-lived connections (HTTP/2, WebSockets, gRPC) whose per-connection request loads vary wildly, so connection count alone is misleading. Start by comparing active connections to total request distribution, then check whether backends have different performance characteristics or whether your application requires session affinity before choosing between least_conn, ip_hash, or weighted balancing.

1. Compare active connections vs request distribution across backends
First, look at `nginx-upstream-peers-active` and `nginx-upstream-peers-requested` across all upstreams. If active connections are roughly equal (say 50, 48, 52 per backend) but total requests are wildly different (100K vs 500K vs 50K), you've identified the problem: round-robin distributes connections evenly, but with long-lived HTTP/2, WebSocket, or gRPC connections, individual connections carry vastly different request loads. This is the most common cause of 'uneven' round-robin and points you toward least_conn as the solution.
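As a minimal sketch of how to get at these per-peer counters, the NGINX Plus API module can be exposed on a loopback port (the upstream name `backend` and port are illustrative; open-source NGINX does not expose per-peer request counts natively):

```nginx
# Sketch: expose the NGINX Plus API read-only on loopback so per-peer
# counters (active connections, total requests) can be compared.
server {
    listen 127.0.0.1:8080;

    location /api {
        api write=off;   # read-only API; NGINX Plus only
    }
}
# Open-source NGINX offers only server-wide totals via stub_status,
# so per-peer data there typically comes from an exporter or log analysis.
```

Querying `/api/<version>/http/upstreams/backend` then returns `active` and `requests` for each peer, which is exactly the comparison this step calls for.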
2. Check if backends have different performance characteristics
Compare `nginx-upstream-peers-response-time` across backends—if some are consistently >2x slower than others, you have capacity imbalance. Also check `nginx-upstream-peers-fails` and the ratio of `nginx-upstream-peers-responses-5xx` to `nginx-upstream-peers-responses-2xx`. Backends showing elevated failures or slow response times are under-provisioned or degraded, which makes them poor candidates for equal distribution. In this case, either use weighted round-robin with `nginx-upstream-peers-weight` favoring the faster backends, or switch to least_conn to automatically route to less-loaded servers.
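A weighted round-robin sketch for the capacity-imbalance case might look like this (hostnames and weight values are illustrative, not measured):

```nginx
# Sketch: weighted round-robin favoring the faster backends.
# With weights 5:2:1, app1 receives roughly 5 of every 8 connections.
upstream backend {
    server app1.internal:8080 weight=5;  # fastest, well-provisioned backend
    server app2.internal:8080 weight=2;
    server app3.internal:8080 weight=1;  # slower or degraded backend
}
```

Weights should track measured capacity (response time under load), not instance size on paper, and be revisited after any backend change.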
3. Review backend health check status for degradation signals
Check `nginx-upstream-peers-health-checks-fails` and `nginx-upstream-peers-health-checks-unhealthy` on the 'overloaded' backends. The insight on `health-check-failures-indicate-upstream-degradation` shows that health check failures often precede visible user errors—if these backends are intermittently failing health checks or showing elevated `nginx-upstream-peers-unavail` counts, they're hitting resource limits (CPU, memory, event loop saturation). This confirms they can't handle equal load and need either capacity adjustments or least_conn to adaptively reduce their traffic.
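A hedged sketch of the health-checking knobs involved (hostnames and thresholds are illustrative; passive checks work in open-source NGINX, active `health_check` requires NGINX Plus):

```nginx
# Sketch: passive health checks -- after 3 failed attempts within 30s,
# the peer is considered unavailable for the next 30s.
upstream backend {
    server app1.internal:8080 max_fails=3 fail_timeout=30s;
    server app2.internal:8080 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        # Active probing on an interval; NGINX Plus only.
        health_check interval=5s fails=3 passes=2;
    }
}
```

Intermittent health-check failures under these settings are the early degradation signal described above; tightening `interval`/`fails` trades faster detection against flapping.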
4. Identify if connections are long-lived and causing request imbalance
If `nginx-upstream-peers-active` stays persistently high (connections lasting minutes or hours) and correlates with uneven request distribution, you have long-lived connection patterns—common with HTTP/2 multiplexing, gRPC, or WebSocket workloads. The `load-balancer-queue-buildup-from-backend-concurrency-limits` insight describes how backends can appear busy despite available CPU when event loops are saturated with requests over long-lived connections. For these workloads, least_conn is essential because it routes new connections to backends with fewer active connections, automatically balancing the load that round-robin cannot see.
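A minimal least_conn sketch for this long-lived-connection case (hostnames and the keepalive size are illustrative assumptions):

```nginx
# Sketch: least_conn sends each new connection to the peer with the
# fewest active connections -- the load round-robin cannot see.
upstream backend {
    least_conn;
    server app1.internal:8080;
    server app2.internal:8080;
    keepalive 32;                        # pool of idle upstream connections
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;          # required for upstream keepalive
        proxy_set_header Connection "";  # strip "close" so connections persist
    }
}
```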
5. Determine if your application requires session affinity
Check if your application uses stateful sessions or persistent WebSocket connections. The insights on `load-balancer-no-sticky-sessions` and `session-affinity-not-configured` show that without session persistence, Socket.io clients oscillate between backends causing handshake resets, and session-based apps break when requests hit different servers. If session affinity is required, use ip_hash (for client IP-based stickiness) or configure cookie-based sticky sessions—do not use least_conn for stateful workloads, as it will break session continuity. Session affinity overrides performance-based routing.
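The two affinity options can be sketched side by side (hostnames are illustrative; the cookie name `srv_id` is a placeholder):

```nginx
# Option A: ip_hash -- the same client IP always maps to the same peer.
# Open-source NGINX; coarse-grained behind shared NATs or corporate proxies.
upstream backend_iphash {
    ip_hash;
    server app1.internal:8080;
    server app2.internal:8080;
}

# Option B: cookie-based sticky sessions (NGINX Plus only) --
# per-client affinity that survives client IP changes.
upstream backend_sticky {
    server app1.internal:8080;
    server app2.internal:8080;
    sticky cookie srv_id expires=1h httponly;
}
```

ip_hash is the simplest choice on open-source NGINX; cookie-based stickiness is preferable when many clients share an egress IP.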
6. Select the appropriate algorithm based on your findings
Based on the diagnostic results: use least_conn for long-lived connections (HTTP/2, gRPC) or when backends have varying response times; use ip_hash or cookie-based sticky sessions when session affinity is required (WebSockets, stateful sessions); use weighted round-robin with `nginx-upstream-peers-weight` when backends have different capacities but connections are short-lived. For most modern applications with HTTP/2 and connection pooling, least_conn outperforms round-robin because it adapts to actual backend load, not just connection count. Monitor `nginx-upstream-peers-active` post-change to verify more even distribution.
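For the common end state described above (long-lived connections plus unequal backend capacity), the pieces combine in one upstream block; hostnames, weights, and thresholds below are illustrative:

```nginx
# Sketch: least_conn for adaptive routing, weights for capacity skew,
# passive health parameters to shed degraded peers.
upstream backend {
    least_conn;
    server app1.internal:8080 weight=3 max_fails=3 fail_timeout=30s;
    server app2.internal:8080 weight=1 max_fails=3 fail_timeout=30s;
}
```

After the change, the verification step is the same metric comparison as step 1: per-peer active connections and request totals should converge.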


Related Insights

Health Check Failures Indicate Upstream Degradation
warning
Upstream backend health check failures (nginx_stream_upstream_peers_health_checks_fails, nginx_stream_upstream_peers_health_checks_unhealthy) provide early warning of backend degradation before user-facing errors occur. These often precede increases in nginx_upstream_peers_fails and nginx_upstream_peers_downtime.
Load Balancer Queue Buildup from Backend Concurrency Limits
critical
When backend FastAPI workers experience event loop blocking, NGINX ingress accumulates requests in upstream queues as backends appear busy despite having available CPU capacity. This results in increased queue wait times and eventually gateway timeouts for clients, creating a cascading latency problem.
Load balancer without sticky sessions causes repeated handshake resets
critical
Session requests routed to wrong worker break user sessions
warning


Monitoring Interfaces

NGINX Datadog
NGINX OpenTelemetry