DNS Resolution Latency Spikes Under Concurrent Load
warninglatencyUpdated Feb 9, 2026
High concurrent query volume exceeds forward plugin's max_concurrent setting, causing queries to queue and response times to spike, particularly impacting applications with aggressive retry logic.
Sources
How to detect:
Monitor coredns_forward_request_time_seconds p95 and p99 percentiles. Alert when latency exceeds 100ms or when coredns_forward_max_concurrent_rejects increases, indicating queries are being dropped due to concurrency limits.
Recommended action:
Increase max_concurrent setting in forward plugin from default to 1000-5000 based on cluster size. Enable aggressive cache prefetching (prefetch 5 60s 20%) to reduce upstream queries. Scale CoreDNS horizontally to distribute concurrent load. Monitor coredns_forward_sockets_open to track connection pooling efficiency.