Slow Response Time and Latency Diagnosis

Incident Response

Diagnosing whether slow response times originate from NGINX configuration, network issues, or backend application performance.

Prompt: My application is experiencing slow response times and I need to figure out if the bottleneck is in NGINX itself or in my backend services. How do I use request_time vs upstream_response_time to isolate where the latency is coming from?

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When diagnosing slow response times with NGINX, the critical first step is comparing upstream response time to total request time to isolate whether the bottleneck is in NGINX or the backend. Then check whether the latency affects all requests uniformly or only tail percentiles, which reveals whether you're dealing with capacity saturation, configuration issues, or intermittent backend problems. Finally, investigate connection pooling and backend-specific issues like event loop blocking.

1. Compare upstream response time to identify the bottleneck location
Start by comparing `nginx-upstream-peers-response-time` (time waiting for backend) to your total request time logged in NGINX ($request_time). If upstream response time accounts for >80% of total request time, the backend is your bottleneck. If upstream time is low but total time is high, the issue is in NGINX's handling—network, SSL termination, client connection speed, or NGINX configuration. Also check `nginx-upstream-peers-header-time`—if it's significantly lower than response time, the backend processes quickly but sends large responses slowly.
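The timing variables referenced here are standard NGINX log variables; a `log_format` along these lines (the format name and log path are placeholders) captures everything step 1 needs:

```nginx
log_format timing '$remote_addr [$time_local] "$request" $status '
                  'rt=$request_time uct=$upstream_connect_time '
                  'uht=$upstream_header_time urt=$upstream_response_time';

access_log /var/log/nginx/timing.log timing;
```

With this in place, comparing `rt` and `urt` on any slow request line shows immediately whether the wait was in the backend (`urt` close to `rt`) or in NGINX, TLS, or client handling (`urt` much smaller than `rt`).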
2. Check if latency is uniform or concentrated in tail percentiles
Look at `nginx-upstream-peers-response-time-histogram` to see the distribution—compare `nginx-upstream-peers-response-time-histogram-median` to p95/p99 percentiles. If median is acceptable (say <100ms) but p95 >250ms or p99 >500ms, you have tail latency issues rather than systemic slowness. Per the insight on response-time-degradation-without-resource-saturation, this pattern with low `nginx-upstream-peers-active` often indicates configuration inefficiencies like poor keepalive settings, excessive DNS lookups, or cache misses rather than true capacity problems.
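The median-versus-tail comparison can be sketched as follows; the thresholds and the nearest-rank percentile method are illustrative, not prescribed by the playbook:

```python
import statistics

def latency_profile(samples):
    """Summarize upstream response times (seconds): median vs. tail percentiles."""
    s = sorted(samples)

    def pct(p):
        # Nearest-rank percentile over the sorted samples
        idx = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
        return s[idx]

    return {"median": statistics.median(s), "p95": pct(95), "p99": pct(99)}

# Mostly fast requests with a slow tail: median healthy, p95/p99 elevated
times = [0.05] * 90 + [0.3] * 8 + [0.9] * 2
profile = latency_profile(times)

# The step's rule of thumb: median < 100ms but p95 > 250ms => tail latency issue
tail_latency_issue = profile["median"] < 0.1 and profile["p95"] > 0.25
```

A result like this one (acceptable median, elevated p95/p99) points at configuration inefficiencies or intermittent backend stalls rather than systemic slowness.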
3. Verify if backend capacity is saturated or artificially limited
Check `nginx-upstream-peers-active` against your backend's maximum connection limit and compare request rate (`nginx-upstream-peers-requested` delta) to response rate (`nginx-upstream-peers-responses` delta)—they should match closely. If active connections are low (<50% of backend max) but latency is high, you likely have event-loop-blocking-causes-serial-request-processing: your async backend is making blocking I/O calls that cause serial-like request handling despite async infrastructure. Also check `nginx-server-zone-processing`—if it's growing, requests are queuing in NGINX waiting for upstream capacity.
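A rough sketch of this step's decision logic, with the metric-to-argument mapping and the 5% tolerance as assumptions:

```python
def diagnose_capacity(active, backend_max, req_delta, resp_delta):
    """Classify upstream behavior from connection and rate counters.

    Assumed mapping to the playbook's metrics:
      active     -> nginx-upstream-peers-active
      req_delta  -> delta of nginx-upstream-peers-requested
      resp_delta -> delta of nginx-upstream-peers-responses
    """
    backlog = req_delta - resp_delta       # requests outpacing responses
    utilization = active / backend_max
    if backlog > 0.05 * req_delta:
        return "queueing: responses lag requests, upstream saturated"
    if utilization < 0.5:
        return "low utilization with high latency: suspect event loop blocking"
    return "utilization and throughput look consistent"

# Low active connections, request/response rates matching: points at blocking I/O
verdict = diagnose_capacity(active=40, backend_max=200,
                            req_delta=1000, resp_delta=998)
```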
4. Distinguish between backend processing delay and data transfer issues
Compare `nginx-upstream-peers-header-time` to `nginx-upstream-peers-response-time`. If header time is high, your backend is slow to start processing requests (database queries, authentication, business logic). If header time is acceptable but total response time is much higher, the issue is transferring the response body—either the responses are very large, network bandwidth is constrained, or there's compression overhead. This distinction tells you whether to optimize backend logic or adjust response sizes and buffering.
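The header-time-versus-response-time split can be expressed as a small classifier; the function and its tie-breaking rule are a sketch, not part of the playbook:

```python
def classify_phase(header_time, response_time):
    """Split upstream latency into time-to-first-byte vs. body transfer.

    header_time   ~ nginx-upstream-peers-header-time (until headers arrive)
    response_time ~ nginx-upstream-peers-response-time (full body received)
    """
    transfer_time = response_time - header_time
    if header_time >= transfer_time:
        # Backend is slow to start responding: queries, auth, business logic
        return ("processing", header_time)
    # Body delivery dominates: large responses, bandwidth, compression overhead
    return ("transfer", transfer_time)

# Headers in 30ms but 850ms total: the body transfer is the problem
phase, seconds = classify_phase(header_time=0.03, response_time=0.85)
```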
5. Investigate connection pooling and keepalive efficiency
Check `nginx-net-writing` (connections waiting on upstream or writing responses) alongside `nginx-upstream-peers-active`. If net-writing is high relative to active upstream connections, NGINX is spending significant time writing responses back to slow clients or waiting on backends. Review your keepalive settings (keepalive directive in upstream block)—insufficient keepalive connections force NGINX to repeatedly establish new connections to backends, adding latency. The response-time-degradation-without-resource-saturation insight specifically calls out suboptimal keepalives as a common cause of tail latency.
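A minimal upstream keepalive configuration looks like this; the upstream name, addresses, and pool size are placeholders, while the three directives (`keepalive`, `proxy_http_version`, clearing the `Connection` header) are the standard NGINX requirements for connection reuse:

```nginx
upstream backend {
    server 10.0.0.10:8080;
    server 10.0.0.11:8080;
    keepalive 32;                        # idle upstream connections kept per worker
}

server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;          # keepalive requires HTTP/1.1
        proxy_set_header Connection "";  # clear "close" so connections are reused
    }
}
```

Without `proxy_http_version 1.1` and the cleared `Connection` header, the `keepalive` directive has no effect and NGINX opens a fresh backend connection per request.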
6. Profile backend application if upstream is confirmed as the bottleneck
If steps above confirm high `nginx-upstream-peers-response-time` is driving the problem, instrument your backend application. For async frameworks (FastAPI, Node.js), measure event loop lag to detect blocking operations—synchronous database calls, CPU-heavy JSON processing, or blocking SDK calls that violate async patterns. The elevated-request-duration-degradation insight suggests correlating backend latency with database query performance, external API calls, and worker capacity. Use application profiling tools (py-spy for Python, clinic.js for Node) to identify hot code paths.
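Event loop lag can be measured with a small asyncio probe; this sketch's interval, sample count, and simulated blocking call are illustrative:

```python
import asyncio
import time

async def monitor_loop_lag(interval, samples):
    """Return the worst observed event loop lag: how much later than
    `interval` each timed sleep actually woke up."""
    worst = 0.0
    for _ in range(samples):
        start = time.monotonic()
        await asyncio.sleep(interval)
        worst = max(worst, time.monotonic() - start - interval)
    return worst

async def main():
    monitor = asyncio.create_task(monitor_loop_lag(interval=0.01, samples=20))
    # Simulate a handler that makes a blocking call inside async code:
    # time.sleep() stalls the entire event loop, delaying every other task.
    for _ in range(5):
        time.sleep(0.05)      # the anti-pattern being detected
        await asyncio.sleep(0.01)
    return await monitor

worst_lag = asyncio.run(main())
```

A healthy loop shows lag near zero; here the 50ms blocking sleeps push the probe's worst-case lag to roughly that magnitude, which is exactly the signature step 6 tells you to look for before reaching for py-spy or clinic.js.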


Related Insights

Response Time Degradation Without Resource Saturation (warning)
When nginx_upstream_peers_response_time p95/p99 percentiles increase but nginx_upstream_peers_active stays low and backend resources (CPU, memory, database) appear healthy, the issue often lies in configuration inefficiencies: suboptimal keepalives, excessive DNS lookups, or poorly tuned caching.
Express handling non-application tasks reduces specialized capacity (warning)
Elevated request duration indicates performance degradation (warning)
Latency Distribution Drift Under Sustained Load (warning)
Services with hidden event loop blocking show subtle latency drift over time, where median response times remain acceptable but p95/p99 percentiles creep upward over weeks. This gradual degradation becomes acute during traffic spikes, revealing that the service is operating with marginal concurrency headroom.
Event Loop Blocking Causes Serial Request Processing (critical)
When NGINX proxies to async application servers (FastAPI, Node.js) but those backends make blocking I/O calls, the event loop stalls, causing serial-like request processing despite async infrastructure. Symptoms include flat throughput curves and rising tail latency even when CPU is moderate.


Monitoring Interfaces

NGINX Datadog
NGINX OpenTelemetry