Memory Leak Diagnosis in NGINX Workers
Critical: Incident Response
Diagnosing and resolving memory leaks in NGINX worker processes that cause gradual memory growth and eventual performance degradation.
Prompt: “My NGINX worker processes are showing steadily increasing memory usage over time, eventually leading to OOM kills. How do I determine if this is a configuration issue, an application memory leak, or an actual NGINX bug?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When diagnosing NGINX worker memory leaks, start by establishing whether all workers or only some are affected to differentiate config issues from request-specific leaks. Then check for OOM kills and shared memory zone exhaustion, which are the most common NGINX-specific causes. Finally, rule out upstream application leaks and configuration issues like unbounded caching or request bodies before suspecting an actual NGINX bug.
1. Establish the memory growth pattern across workers
Track individual worker RSS over time using `ps aux | grep 'nginx: worker'` or your monitoring stack. If all workers grow at the same rate, you're looking at a configuration issue affecting all traffic (shared memory, cache, or config-driven buffering). If only some workers grow while others stay stable, specific requests or connections are likely triggering the leak, pointing toward either an NGINX bug or an upstream application issue.
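The sampling above can be sketched in shell; the log path is a hypothetical example, and `ps` reports worker processes as `nginx: worker process`:

```shell
# Print PID and RSS (KB) for each NGINX worker.
ps aux | awk '/nginx: worker/ && !/awk/ {print $2, $6}'

# Sample every 60 seconds into a log (hypothetical path) for later comparison:
# while true; do date +%s; ps aux | awk '/nginx: worker/ && !/awk/ {print $2, $6}'; sleep 60; done \
#   >> /var/log/nginx-worker-rss.log
```

Plotting RSS per PID over time makes uniform growth (config-wide) versus divergent growth (request-specific) obvious at a glance.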
2. Check for OOM kills and worker respawning
Monitor the `nginx-processes-respawned` metric to see if workers are being abnormally terminated and restarted. If this counter is increasing, the OS is killing workers due to memory exhaustion. Cross-reference with `dmesg` or `/var/log/syslog` for "Out of memory: Kill process" messages to confirm OOM kills. Frequent respawning confirms a real memory leak, not just high but stable memory usage.
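Confirming OOM kills from the kernel and system logs might look like this sketch:

```shell
# Kernel OOM-killer messages name the victim process:
dmesg -T | grep -i 'out of memory'

# On syslog hosts, check whether nginx workers were the victims:
grep -i 'out of memory: kill' /var/log/syslog | grep -i nginx

# On systemd hosts: journalctl -k | grep -i 'out of memory'
```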
3. Inspect shared memory zone exhaustion
Check `nginx-slab-slot-fails`: if this counter is incrementing, you have shared memory exhaustion in zones such as `limit_req`, `limit_conn`, `ssl_session_cache`, or custom zones. Compare `nginx-slab-pages-used` against `nginx-slab-pages-free` to get the utilization percentage. This is one of the most common NGINX-specific memory issues and means you need to increase zone sizes or reduce the data being stored (e.g., shorter SSL session timeouts, smaller key sizes).
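If you run NGINX Plus with the REST API enabled, the slab zones can be inspected directly; the endpoint and API version below are assumptions to adjust for your setup, and the utilization arithmetic works from any used/free page counts:

```shell
# Dump raw slab zone stats (NGINX Plus API; adjust host and version):
# curl -s http://127.0.0.1/api/9/slabs

# Utilization percentage from used/free page counts (example values):
used=460; free=52
awk -v u="$used" -v f="$free" 'BEGIN {printf "zone utilization: %.0f%%\n", 100*u/(u+f)}'
```

Sustained utilization near 100% alongside incrementing slot fails is the signature of an undersized zone.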
4. Verify proxy cache isn't growing unbounded
Monitor `nginx-cache-size` and compare it to `nginx-cache-max-size`. If cache size is consistently at or near max, the cache manager may be struggling to evict old entries fast enough, causing memory pressure. If `nginx-cache-max-size` is not set or cache size keeps growing, you have unbounded cache growth. Also check that `proxy_cache_path` has `inactive` and `max_size` properly configured.
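A quick check of on-disk cache footprint against the configured limits might look like this; the cache path is a hypothetical example taken from `proxy_cache_path`:

```shell
# Actual cache footprint on disk (use the path from your proxy_cache_path):
du -sh /var/cache/nginx/proxy

# Confirm max_size and inactive are both configured:
grep -Rh 'proxy_cache_path' /etc/nginx/ | grep -o 'max_size=[^ ;]*\|inactive=[^ ;]*'
```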
5. Test for unbounded request body memory exhaustion
The `unbounded-upload-memory-exhaustion` insight shows that large uploads without size limits can exhaust memory. Verify `client_max_body_size` is set to a reasonable value (not overly large). Test by sending large POST/PUT requests while monitoring worker memory: if memory spikes and doesn't release after request completion, the upstream application may be buffering entire request bodies in memory rather than streaming them.
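One way to run that probe is to generate a large body and POST it while watching worker RSS in another terminal; the upload URL is hypothetical:

```shell
# Verify a sane limit is configured:
grep -R 'client_max_body_size' /etc/nginx/

# Generate a 512 MB test body and POST it (hypothetical endpoint);
# monitor worker RSS in another terminal while this runs:
dd if=/dev/zero of=/tmp/big.bin bs=1M count=512
curl -s -o /dev/null -w '%{http_code}\n' \
  --data-binary @/tmp/big.bin http://localhost/upload
```

If the limit is working, NGINX should reject the request with a 413 rather than buffer it.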
6. Differentiate NGINX leaks from upstream application leaks
The `gunicorn-multiple-sites-memory-exhaustion` and `gunicorn-memory-swapping-causes-504-timeouts` insights highlight that upstream workers (Gunicorn, uWSGI, etc.) often leak memory, not NGINX itself. Monitor upstream process memory separately using `ps aux | grep gunicorn` or equivalent. If upstream workers are growing but NGINX workers are stable, the leak is in your application code, not NGINX. Check for swap usage with `free -m`: if swap exceeds 20-30% of total RAM, you have a capacity problem.
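Comparing upstream and NGINX footprints, plus swap pressure, can be scripted roughly as:

```shell
# Total RSS (KB) of upstream workers vs NGINX workers:
ps aux | awk '/[g]unicorn/ {s += $6} END {print "gunicorn RSS KB:", s+0}'
ps aux | awk '/nginx: worker/ && !/awk/ {s += $6} END {print "nginx RSS KB:", s+0}'

# Swap utilization, to check against the 20-30% rule of thumb above:
free -m | awk '/^Swap:/ && $2 > 0 {printf "swap used: %.0f%%\n", 100*$3/$2}'
```

The bracketed `[g]unicorn` pattern is a common trick to keep the `awk` process itself out of its own match.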
7. Test gzip compression buffer exhaustion
Per the `nginx-gzip-large-response-buffer-exhaustion` insight, gzip compression of large responses can exhaust buffers and cause memory spikes. Temporarily disable gzip in your config (`gzip off;`) and reload NGINX, then monitor if memory growth stops or slows. If it does, tune `gzip_buffers`, reduce `gzip_min_length`, or selectively disable gzip for large responses rather than globally enabling it.
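Toggling gzip for this A/B memory test might be done as follows; the config path is a common default, so adjust it for your layout:

```shell
# Back up the config, flip gzip off, validate, and reload:
sudo sed -i.bak 's/gzip on;/gzip off;/' /etc/nginx/nginx.conf
sudo nginx -t && sudo nginx -s reload

# Restore afterwards:
# sudo mv /etc/nginx/nginx.conf.bak /etc/nginx/nginx.conf && sudo nginx -s reload
```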
Related Insights
- Unbounded upload size exhausts memory in Starlette applications (warning)
- Nginx gzip compression with large responses may exhaust buffers causing CPU spike (warning)
- Hosting multiple sites on 1GB RAM server causes memory exhaustion (critical)
- Memory swapping causes slow response times and Nginx 504 timeouts (critical)