Linkerd

Hidden Retry Storm Degrading Latency

warning
latency
Updated Jan 27, 2026

Linkerd proxies can retry failed requests automatically, but a misconfigured retry policy creates cascading delays that application monitoring does not see. Every retried attempt adds to the latency the caller observes, yet nothing appears in application logs because the request ultimately succeeds.
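To make the hidden cost concrete, here is a minimal arithmetic sketch. It assumes attempts run one after another with no backoff delay between them. The function name and the sample timings are hypothetical and were not measured on a real mesh.

```python
def caller_observed_latency_ms(failed_attempts_ms, successful_attempt_ms):
    """Latency seen by the caller when the proxy retries until one attempt succeeds.

    failed_attempts_ms: durations of the attempts that failed or timed out
    successful_attempt_ms: duration of the final attempt that succeeded
    """
    return sum(failed_attempts_ms) + successful_attempt_ms


# Hypothetical request: two attempts time out at 300 ms each, the third succeeds in 40 ms.
total = caller_observed_latency_ms([300.0, 300.0], 40.0)
print(total)  # 640.0
```

In this example the backend that served the final attempt would log a 40 ms request, while the caller waited 640 ms.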

How to detect:

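Monitor for elevated P95/P99 latency without a corresponding rise in application errors. Use 'linkerd viz stat' to detect high retry rates (retries making up more than 5% of requests). Watch for discrepancies between route-level and backend-level success rates: a route that looks healthy while its backend success rate is lower suggests that retries are masking failures.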
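The 5% check can be sketched as a small calculation. This assumes you have already read two request rates for a route, for example the effective and actual request-rate columns that 'linkerd viz routes -o wide' reports for service-profile routes. The function names and the way the numbers are obtained are assumptions, and no CLI output is parsed here.

```python
def retry_percentage(effective_rps: float, actual_rps: float) -> float:
    """Percentage of requests sent by the proxy that were retries.

    effective_rps: request rate as the caller sees it (retries hidden)
    actual_rps: request rate actually sent to the backend (retries included)
    """
    if actual_rps <= 0:
        return 0.0
    retries_per_second = max(actual_rps - effective_rps, 0.0)
    return 100.0 * retries_per_second / actual_rps


def exceeds_retry_threshold(effective_rps: float, actual_rps: float,
                            threshold_pct: float = 5.0) -> bool:
    """True when the retry share is above the alert threshold (5% by default)."""
    return retry_percentage(effective_rps, actual_rps) > threshold_pct


# Hypothetical route: callers issue 100 rps, the proxy sends 110 rps to the backend.
print(exceeds_retry_threshold(100.0, 110.0))  # True
```

The percentage is taken relative to the actual rate, so it reads as "what share of backend traffic is retries". Dividing by the effective rate instead would give a slightly higher figure.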

Recommended action:

Review the retry configuration in each service profile. Adjust retry budgets and backoff policies so that retries cannot multiply load on a struggling backend. Use 'linkerd viz routes' to identify which routes have excessive retry rates. Consider implementing circuit breakers to prevent retry storms during outages.
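When adjusting a retry budget, it can help to estimate how much extra load the budget permits in the worst case. The sketch below models a budget with two knobs, loosely following the retryRatio and minRetriesPerSecond fields of a service profile's retry budget. It ignores the budget's time window, and the class and function names are hypothetical, so treat it as a rough estimate and not as a description of the proxy's exact behavior.

```python
from dataclasses import dataclass


@dataclass
class RetryBudget:
    retry_ratio: float             # retries allowed per original request
    min_retries_per_second: float  # flat allowance added on top of the ratio


def max_retries_per_second(budget: RetryBudget, original_rps: float) -> float:
    """Upper bound on the retry rate the budget permits at a given original rate."""
    return budget.min_retries_per_second + budget.retry_ratio * original_rps


def worst_case_amplification(budget: RetryBudget, original_rps: float) -> float:
    """Ratio of total backend load (originals plus retries) to original load."""
    if original_rps <= 0:
        return 1.0
    retries = max_retries_per_second(budget, original_rps)
    return (original_rps + retries) / original_rps


# Hypothetical budget: 20% retry ratio plus 10 retries per second, at 100 rps.
budget = RetryBudget(retry_ratio=0.2, min_retries_per_second=10.0)
print(worst_case_amplification(budget, 100.0))
```

Under these assumptions the flat allowance dominates for low-traffic routes: the same budget at 10 rps permits more retries than original requests, so it deserves a look before traffic drops or an outage begins.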