NGINX, FastAPI

Latency Distribution Drift Under Sustained Load

warning
latency
Updated Jan 10, 2026

Services with hidden event loop blocking show subtle latency drift over time: median response times remain acceptable while p95 and p99 latencies creep upward over weeks. The gradual degradation becomes acute during traffic spikes, revealing that the service has been operating with marginal concurrency headroom.

How to detect:

Track the delta between median and p95/p99 latencies over multi-week periods. A widening gap, where the median stays flat but tail latencies trend upward, suggests growing event loop contention. Compare latency distributions during steady-state traffic with those during burst periods; disproportionate p99 increases during bursts indicate blocking operations that limit concurrency. Watch for 'serial-like' latency behavior, where adding concurrent requests causes roughly linear latency growth instead of latency staying flat.
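The percentile-gap check described above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the function names (`percentile`, `tail_ratio`, `detect_drift`) and the thresholds (`max_ratio_growth`, `median_tolerance`) are assumptions chosen for the example, and each window is assumed to be a plain list of latency samples in milliseconds exported from whatever metrics store is in use.

```python
from statistics import median, quantiles


def percentile(samples, pct):
    """Return the pct-th percentile (1..99) of a list of latency samples."""
    # quantiles(n=100) yields 99 cut points; index pct - 1 is the pct-th.
    return quantiles(samples, n=100, method="inclusive")[pct - 1]


def tail_ratio(samples, pct=99):
    """Ratio of a tail percentile to the median for one window."""
    return percentile(samples, pct) / median(samples)


def detect_drift(windows, pct=99, max_ratio_growth=1.5, median_tolerance=0.10):
    """Compare the oldest and newest windows (oldest first).

    Flags drift when the median is roughly flat but the tail/median
    ratio has grown by at least max_ratio_growth. Both thresholds are
    illustrative assumptions and should be tuned per service.
    """
    baseline, latest = windows[0], windows[-1]
    base_median, last_median = median(baseline), median(latest)
    median_flat = abs(last_median - base_median) / base_median <= median_tolerance
    ratio_widened = tail_ratio(latest, pct) / tail_ratio(baseline, pct) >= max_ratio_growth
    return median_flat and ratio_widened


if __name__ == "__main__":
    # Synthetic data: same median, but the slowest 5% of requests doubled.
    week_1 = [20.0] * 95 + [40.0] * 5
    week_6 = [20.0] * 95 + [80.0] * 5
    print(detect_drift([week_1, week_6]))
```

Comparing only the first and last windows keeps the sketch short; a real alert would more likely fit a trend across all windows to avoid reacting to a single noisy week.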

Recommended action:

Establish baseline latency distributions and set alerts on percentile gaps (e.g., the p99/median ratio) that widen over rolling windows. Run targeted load tests that ramp concurrency while measuring latency distributions, to find the concurrency level at which serial-like behavior emerges. Focus investigation on the endpoints whose percentile spread has grown the most. Instrument critical request paths with detailed timing to locate the blocking sections that disproportionately affect tail latencies.
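The concurrency ramp can be illustrated with a small in-process simulation. This is a sketch under stated assumptions rather than a real load test: `blocking_handler` and `cooperative_handler` are hypothetical stand-ins for an endpoint that blocks the event loop (for example, a synchronous call inside an `async def` route) and one that awaits properly, and the 10 ms delay is arbitrary. Against a real service, the same ramp would be driven by an HTTP load generator.

```python
import asyncio
import time


async def blocking_handler():
    """Hypothetical handler that blocks the event loop while it waits."""
    time.sleep(0.01)  # stands in for a synchronous call in an async route


async def cooperative_handler():
    """Hypothetical handler that yields to the event loop while it waits."""
    await asyncio.sleep(0.01)


async def timed_call(handler, batch_start):
    """Latency measured from batch start, so queueing delay is included."""
    await handler()
    return time.perf_counter() - batch_start


async def ramp(handler, levels=(1, 2, 4, 8)):
    """Return {concurrency: worst-case latency in seconds} per level."""
    results = {}
    for concurrency in levels:
        batch_start = time.perf_counter()
        latencies = await asyncio.gather(
            *(timed_call(handler, batch_start) for _ in range(concurrency))
        )
        results[concurrency] = max(latencies)
    return results


def run_ramp(handler, levels=(1, 2, 4, 8)):
    """Synchronous wrapper so the ramp can be called from a plain script."""
    return asyncio.run(ramp(handler, levels))


if __name__ == "__main__":
    for name, handler in [("blocking", blocking_handler),
                          ("cooperative", cooperative_handler)]:
        for concurrency, worst in run_ramp(handler).items():
            print(f"{name:12s} concurrency={concurrency} worst={worst * 1000:.1f} ms")
```

With the blocking handler, worst-case latency should grow roughly in proportion to concurrency, because each request waits for the ones ahead of it; with the cooperative handler it should stay close to the single-request latency. That linear growth is the serial-like signature to look for in real load test results.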