Buffer Size Tuning for Proxied Applications

Capacity Planning

Optimizing `proxy_buffer_size` and `proxy_buffers` to prevent disk I/O and improve response latency for proxied applications.

Prompt: My NGINX proxy is showing high disk I/O and slow response times when proxying to my backend application. How do I size proxy_buffers and proxy_buffer_size correctly to keep responses in memory instead of writing to disk?

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When investigating NGINX proxy buffer issues causing high disk I/O and slow response times, start by confirming disk I/O correlation with latency spikes, then measure actual response sizes against your buffer configuration. Work through gzip compression impact, response time patterns, and backend vs proxy latency before tuning buffer sizes based on real traffic patterns.

1. Confirm disk I/O blocking is causing latency spikes
First, verify that disk I/O wait times correlate with response latency increases by checking system disk metrics alongside `nginx_upstream_peers_response_time`. The `master-process-disk-io-blocking-latency` insight shows that disk I/O blocking can cause multi-second response latency even when workers are idle. If you see disk I/O wait spikes (especially on magnetic EBS volumes) coinciding with p95/p99 response time increases in `nginx_upstream_peers_response_time_histogram`, you've confirmed the root cause. This validates that buffering to disk is the problem before you invest time in buffer tuning.
2. Measure actual response sizes from your backend
Check `nginx_upstream_peers_received` to understand the distribution of response sizes coming from your backend application. NGINX defaults to eight proxy buffers of one memory page each (`8 4k` or `8 8k` depending on platform, i.e. 32KB-64KB total), so responses larger than this will spill to disk. Calculate your p95 and p99 response sizes—if they're consistently above 64KB-128KB, you're writing to disk frequently. Also review `nginx_upstream_peers_sent` to understand how much data you're sending upstream, though note that request bodies are governed by separate client-body buffering, not the proxy response buffers.
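The size-vs-capacity comparison can be sketched as follows. The response sizes here are illustrative stand-ins for an export of `nginx_upstream_peers_received`; substitute your real distribution.

```python
from statistics import quantiles

# Illustrative response sizes in bytes (replace with your real export
# of nginx_upstream_peers_received).
sizes = [12_000] * 80 + [48_000] * 10 + [300_000] * 10

# Default capacity on an 8KB-page platform: 8 buffers x 8k = 64KB total.
default_capacity = 8 * 8 * 1024

cuts = quantiles(sizes, n=100)  # 99 cut points
p95, p99 = cuts[94], cuts[98]
print(f"p95={p95/1024:.0f}KB p99={p99/1024:.0f}KB "
      f"capacity={default_capacity//1024}KB")

# Fraction of responses that cannot fit in memory and spill to disk.
spill = sum(s > default_capacity for s in sizes) / len(sizes)
print(f"fraction of responses spilling to disk: {spill:.0%}")
```

With this sample distribution, p95 is well above the default capacity and roughly one in ten responses spills to disk, which is the signal that buffer tuning is worthwhile.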
3. Look for bimodal response time distribution indicating buffer exhaustion
Examine `nginx_upstream_peers_response_time_histogram` for a bimodal distribution—you'll typically see fast responses (fitting in buffers) clustered at one latency and slow responses (spilling to disk) at a much higher latency. Cross-reference with `nginx_net_writing` to see how many connections are stuck in the writing state, which increases when responses are being buffered to disk. If `nginx_net_writing` is consistently high (>20% of active connections) and correlates with the slower response time bucket, you're seeing buffer exhaustion in action.
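A quick way to quantify the bimodal shape and the writing-state pressure is a sketch like this; the histogram buckets and connection gauges are illustrative placeholders for real values from `nginx_upstream_peers_response_time_histogram` and `nginx_net_writing`.

```python
# Illustrative histogram: bucket upper bound (ms) -> request count.
histogram = {50: 4200, 100: 900, 250: 40, 500: 10, 2000: 15, 5000: 830}

total = sum(histogram.values())
# A bimodal shape shows up as substantial mass in the slowest buckets
# with a near-empty "valley" between the fast and slow modes.
slow = sum(c for bound, c in histogram.items() if bound >= 2000)
valley = sum(c for bound, c in histogram.items() if 250 <= bound < 2000)
print(f"slow-bucket share: {slow/total:.0%}, valley share: {valley/total:.0%}")

# Connections buffering responses to disk sit in the writing state.
writing, active = 120, 400  # illustrative gauges
print(f"writing/active: {writing/active:.0%}")  # >20% suggests exhaustion
```

A large slow share with an empty valley, combined with a writing/active ratio above roughly 20%, matches the buffer-exhaustion signature described above.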
4. Test if gzip compression is exhausting buffers
The `nginx-gzip-large-response-buffer-exhaustion` insight shows that gzip compression with large responses can exhaust buffers and cause CPU spikes. Temporarily disable gzip compression in your NGINX config (comment out `gzip on;`) and monitor whether `nginx_upstream_peers_response_time` improves and disk I/O decreases. Gzip can increase memory requirements significantly because NGINX must buffer the response and hold per-connection compression state while compressing it for the client. If disabling gzip resolves the issue, you'll need to either increase the gzip buffer settings or apply compression more selectively (for example, only to compressible content types or above a minimum response length).
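A sketch of the gzip test and the follow-up tuning, using real `ngx_http_gzip_module` directives; the specific buffer counts and threshold are illustrative starting points, not recommendations for every workload.

```nginx
# Step 4a: temporarily disable compression to test whether gzip
# buffering is driving the disk I/O. Re-enable after measuring.
# gzip on;

# Step 4b: if gzip is the culprit but you need to keep it, give the
# gzip filter more compression buffers (count and size)...
gzip_buffers 32 8k;

# ...restrict it to content that compresses well...
gzip_types text/plain text/css application/json application/javascript;

# ...and skip tiny responses where compression buys little.
gzip_min_length 1024;
```

Re-run the disk I/O and response-time comparison after each change so you attribute the improvement to the right directive.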
5. Distinguish backend slowness from proxy buffering latency
Compare `nginx_upstream_peers_header_time` (time to receive response headers) to total `nginx_upstream_peers_response_time` (time to receive complete response). If header time is 50ms but total time is 2000ms, the backend generated the response quickly but NGINX took 1950ms to buffer and transmit it—that's a buffering problem. If both metrics are high and similar, your backend is slow and buffer tuning won't help much. Also check the `response-time-degradation-without-resource-saturation` insight: if response times are elevated but `nginx_upstream_peers_active` is low and backend resources look healthy, you likely have configuration inefficiencies including buffer issues.
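If you want this header-time-vs-total-time split per request rather than as an aggregate metric, NGINX exposes it through built-in log variables. A sketch, assuming a hypothetical log path; the variables themselves (`$request_time`, `$upstream_connect_time`, `$upstream_header_time`, `$upstream_response_time`) are standard.

```nginx
# Log the upstream timing split so buffering latency is visible per
# request: $upstream_header_time approximates backend generation time;
# a large gap between it and $upstream_response_time / $request_time
# is proxy-side buffering and client transmission.
log_format upstream_timing '$remote_addr "$request" '
                           'rt=$request_time '
                           'uct=$upstream_connect_time '
                           'uht=$upstream_header_time '
                           'urt=$upstream_response_time';

access_log /var/log/nginx/upstream_timing.log upstream_timing;
```

Grepping this log for requests where `urt` greatly exceeds `uht` gives you concrete examples of the buffering problem to test your tuning against.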
6. Calculate and apply appropriate buffer sizes based on traffic patterns
Based on your p95 response size from `nginx_upstream_peers_received`, set `proxy_buffers` to accommodate most responses in memory. For example, if p95 is 256KB, configure `proxy_buffers 32 8k;` (256KB total) or `proxy_buffers 64 4k;` (256KB total). Set `proxy_buffer_size` to 2-4x your typical response header size (usually 4k-8k is sufficient). If you can't avoid disk writes entirely due to very large responses, configure `proxy_temp_path` to use a tmpfs filesystem to avoid physical disk blocking, as recommended in the `master-process-disk-io-blocking-latency` insight. Monitor `nginx_upstream_peers_response_time_histogram` after changes to verify the slow response bucket disappears.
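Putting the sizing advice together, a sketch of the resulting configuration for the 256KB-p95 example above; the upstream name, temp path, and exact values are illustrative and should be derived from your own measurements.

```nginx
location / {
    proxy_pass http://backend;  # illustrative upstream

    # Sized for a ~256KB p95 response: 32 buffers x 8k.
    proxy_buffers 32 8k;

    # Header buffer: 2-4x your typical response header size.
    proxy_buffer_size 8k;

    # Must be at least proxy_buffer_size and less than the total
    # proxy_buffers space minus one buffer; caps how much of the
    # response can be busy sending to the client at once.
    proxy_busy_buffers_size 16k;
}

# For unavoidable spills from very large responses, point temp files at
# a tmpfs mount (mount it at this illustrative path first) so "disk"
# writes land in RAM instead of blocking on a physical device.
proxy_temp_path /var/cache/nginx/proxy_tmpfs 1 2;
```

The `proxy_busy_buffers_size` constraint matters: NGINX refuses to start if it is smaller than `proxy_buffer_size` or too close to the total buffer space, so validate with `nginx -t` before reloading.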
7. Verify improvements and check for error rate changes
After tuning buffers, monitor `nginx_server_zone_responses_5xx` to ensure you haven't introduced errors from oversized buffer allocations exhausting worker memory. Also verify that `nginx_upstream_peers_response_time` p95/p99 has improved and `nginx_net_writing` has decreased. The goal is to see response times normalize without increasing error rates. If you see 5xx errors increase after buffer tuning, you've allocated too much memory per connection and need to reduce buffer sizes or lower `worker_connections` so the worst-case buffer footprint fits in available RAM.
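The worst-case memory budget behind that warning is simple arithmetic. A sketch with illustrative worker settings; each buffered proxied connection can hold up to `proxy_buffer_size` plus all `proxy_buffers` in memory at once.

```python
# Illustrative configuration values; substitute your own.
proxy_buffer_size = 8 * 1024        # proxy_buffer_size 8k
proxy_buffers = 32 * 8 * 1024       # proxy_buffers 32 8k -> 256KB
worker_connections = 4096
worker_processes = 4

# Worst case: every connection on every worker is buffering a full
# response at once. Real usage is lower, but this bounds the exposure.
per_conn = proxy_buffer_size + proxy_buffers
worst_case = per_conn * worker_connections * worker_processes
print(f"worst-case proxy buffer memory: {worst_case / 2**20:.0f} MiB")
```

If the worst case exceeds the RAM you can spare for NGINX, shrink the buffers or lower `worker_connections` before the OOM killer makes the decision for you.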

Monitoring Interfaces

NGINX Datadog
NGINX OpenTelemetry