Keepalive Connection Pooling to Upstream Servers

Proactive Health

Configuring upstream keepalive connections to reduce TCP handshake overhead and prevent port exhaustion when connecting to backends.

Prompt: I'm seeing a lot of TIME_WAIT sockets and my NGINX server is running out of ephemeral ports when connecting to backends. How should I configure upstream keepalive connections to reuse connections and avoid port exhaustion?

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When investigating upstream port exhaustion in NGINX, start by confirming high connection churn rates and checking whether keepalive is actually configured. Then verify the HTTP/1.1 protocol requirements are met, tune the keepalive pool size to match your traffic volume, and finally check for connection pool saturation or backend health issues that prevent connection reuse.

1. Confirm connection churn and lack of keepalive reuse
Check `nginx-net-backend-opened-per-s` to see how rapidly new connections are being established to backends. If this rate is high (hundreds or thousands per second) while `nginx-upstream-keepalive` shows zero or very low idle connections, you're creating a new TCP connection for every request instead of reusing them. Cross-reference with `nginx-upstream-peers-requested` to calculate the ratio—ideally you want 10-100+ requests per new connection opened.
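As a quick spot check alongside those metrics, you can count TIME_WAIT sockets toward a backend and compute the reuse ratio by hand. This is a sketch: the backend port is a placeholder, and the ratio numbers are sample readings to substitute with your own.

```shell
# Count sockets in TIME_WAIT toward a hypothetical backend on port 8080
# (requires iproute2's ss; skipped if unavailable).
if command -v ss >/dev/null 2>&1; then
    ss -tan state time-wait '( dport = :8080 )' | tail -n +2 | wc -l
fi

# Reuse ratio from the two metrics above (sample readings -- substitute
# your own): requests served per new connection opened. A ratio near 1
# means every request opened a fresh connection, i.e. no reuse at all.
awk 'BEGIN { requests = 12000; opened = 11800; printf "%.1f requests/connection\n", requests / opened }'
```

A healthy keepalive setup pushes this ratio well into double digits; a ratio hovering near 1 confirms the churn the metrics suggest.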
2. Verify upstream keepalive is configured
The most common cause of this issue is simply missing the keepalive directive in your upstream block. Check your NGINX config for `keepalive N` within the upstream block (e.g., `keepalive 32`). If it's absent, that's your root cause—NGINX defaults to closing connections after each request. As noted in the `nginx-keepalive-upstream-configuration` insight, without this directive, every request incurs full TCP handshake overhead.
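A minimal upstream block with the directive in place might look like the following sketch; the upstream name and server addresses are placeholders.

```nginx
upstream app_backend {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;

    # Keep up to 32 idle connections to this group cached per worker
    # process. This caps *idle* connections only; it does not limit the
    # total number of connections NGINX may open under load.
    keepalive 32;
}
```

Note that `keepalive` alone is not sufficient; the proxy settings in the next step are also required for connections to actually be reused.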
3. Ensure HTTP/1.1 and proper headers are set
Even with keepalive configured, NGINX speaks HTTP/1.0 to proxied backends by default and makes no attempt to keep those connections open. Verify that your location blocks contain `proxy_http_version 1.1;` and `proxy_set_header Connection "";`. The empty value stops NGINX from forwarding a client's `Connection: close` header to the backend. Without both settings, backends will close the connection after each response regardless of your keepalive directive.
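A sketch of the required proxy settings; the location path and the `app_backend` upstream (assumed to be defined elsewhere) are hypothetical.

```nginx
location /api/ {
    proxy_pass http://app_backend;

    # Keepalive to upstreams requires HTTP/1.1 ...
    proxy_http_version 1.1;

    # ... and clearing the Connection header, so a client's
    # "Connection: close" is not forwarded to the backend.
    proxy_set_header Connection "";
}
```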
4. Size the keepalive pool for your traffic volume
Compare `nginx-upstream-peers-active` (concurrent active connections) to your configured keepalive pool size. If active connections consistently exceed your keepalive value (e.g., 50 concurrent connections but keepalive 16), you're forcing connection creation during traffic spikes. For high-throughput APIs, start with keepalive 64-128. Monitor `nginx-net-backend-opened-per-s`—it should drop significantly after increasing the pool size to match your concurrency needs.
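A sizing sketch, assuming a hypothetical backend that peaks around 100 concurrent active connections:

```nginx
upstream app_backend {
    server 10.0.0.11:8080;

    # Sized above the observed peak concurrency (~100 active peers in
    # this assumed scenario) so spikes reuse cached connections instead
    # of opening new ones.
    keepalive 128;

    # Optionally raise how many requests one connection may serve before
    # NGINX recycles it (supported in the upstream context since 1.15.3).
    keepalive_requests 1000;
}
```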
5. Check for connection pool saturation blocking workers
Watch `nginx-upstream-peers-active` approaching backend capacity limits—if your PHP-FPM has pm.max_children=12 but NGINX tries to open 50 concurrent connections, you'll exhaust the backend pool. The `upstream-connection-pool-saturation-blocks-nginx-workers` insight warns this causes connection failures and forces reconnection attempts. Check `nginx-upstream-peers-fails` for rising failure counts, which indicate backends rejecting connections. Set per-upstream limits matching backend capacity to fail fast rather than saturating pools.
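For the PHP-FPM scenario above, a per-upstream cap can be sketched with the `max_conns` server parameter (available in open-source NGINX since 1.11.5); the socket path is a placeholder.

```nginx
upstream php_backend {
    # Cap concurrent connections to match PHP-FPM's pm.max_children=12,
    # so excess requests fail fast instead of saturating the backend pool.
    server unix:/run/php-fpm.sock max_conns=12;

    # Keep a modest idle cache below the cap.
    keepalive 8;
}
```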
6. Tune keepalive timeout for your workload pattern
If `nginx-upstream-keepalive` shows connections being maintained but `nginx-net-backend-opened-per-s` is still elevated during traffic valleys, your keepalive_timeout may be too low, causing premature closure. The `keepalive-misconfiguration-exhausts-connections` insight recommends 2-5s for API endpoints with short requests, 5-15s for static content. Too high wastes resources; too low defeats the purpose. Monitor `nginx-upstream-peers-response-time` to ensure timeout exceeds typical response latency by 2-3x.
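For an API workload in the recommended 2-5s range, the idle timeout can be set directly in the upstream block (supported since NGINX 1.15.3; the default is 60s). The upstream name and address are placeholders.

```nginx
upstream api_backend {
    server 10.0.0.11:8080;
    keepalive 64;

    # Close cached connections after 5s of idleness -- a middle ground
    # for short API requests; static content may warrant 5-15s.
    keepalive_timeout 5s;
}
```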
7. Investigate backend health if connection reuse remains low
If keepalive is properly configured but `nginx-upstream-keepalive` stays near zero while `nginx-upstream-peers-fails` increases, backends may be actively closing connections or timing out. Check `nginx-net-backend-dropped-per-s` for abnormal drops and correlate with backend logs. Backends that return `Connection: close` headers, have aggressive timeout settings, or are restarting frequently will prevent NGINX from maintaining the keepalive pool, forcing constant reconnection regardless of your NGINX configuration.
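One quick way to see whether a backend is tearing connections down itself is to inspect its response headers. This sketch shows the check (the backend address is hypothetical), followed by an offline illustration against a captured response that you can run anywhere.

```shell
# Live form (the backend address is a placeholder):
#   curl -sI http://10.0.0.11:8080/health | grep -i '^connection:'

# Offline illustration against a captured response: a nonzero match
# count means the backend is explicitly refusing persistent connections.
printf 'HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n' | grep -ci '^connection: close'
```

If the backend answers with `Connection: close`, fix its own keepalive settings; no amount of NGINX tuning can keep a connection the peer insists on closing.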

Monitoring Interfaces

NGINX Datadog
NGINX OpenTelemetry