High-Percentile Latency Divergence
warninglatencyUpdated Jan 10, 2026
P95 and P99 latencies diverge significantly from median/P50 latencies, indicating tail latency problems that affect user experience despite healthy average metrics.
Sources
How to detect:
Monitor when P95 latency exceeds 150ms for ingestion endpoints or 250ms for query endpoints. Alert when P99/P50 ratio exceeds 5x, indicating severe tail latency. Track latency distribution histograms to identify bimodal patterns.
Recommended action:
Instrument slow request paths with detailed tracing to identify blocking operations causing tail latency. Set explicit latency SLOs for P95/P99. Implement request timeouts and circuit breakers to prevent cascading delays. Use load shedding for non-critical requests during spikes.