Ingress Controller Performance Under High Traffic
Capacity Planning (warning)
Determine if NGINX Ingress Controller is the bottleneck for application latency under high load.
Prompt: “We're seeing increased response times during peak traffic and I think the NGINX Ingress Controller might be overwhelmed — how do I tell if we need to scale up ingress replicas or tune the configuration?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When diagnosing NGINX Ingress Controller performance under high traffic, start by checking if the ingress pods themselves are resource-saturated, then verify traffic is distributed evenly across replicas. Before scaling up ingress, rule out backend saturation showing up as queue buildup, and check for network errors or bandwidth saturation that might indicate configuration issues rather than capacity problems.
1. Check ingress controller resource saturation
First, look at `kubernetes_cpu_usage` and `kubernetes_memory_usage` for your NGINX ingress controller pods and compare them to `kubernetes_cpu_limits` and `kubernetes_memory_limits`. If CPU is consistently above 80% of limits during peak traffic or memory is approaching limits, you've found your bottleneck and need to either increase resource limits or scale horizontally. If resources are well below limits (say, 40-60% CPU), the problem is likely elsewhere—don't just throw more replicas at it.
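If saturation confirms the ingress controller itself is the bottleneck, horizontal scaling can be automated rather than done by hand. A minimal sketch, assuming an ingress-nginx install with a Deployment named `ingress-nginx-controller` in the `ingress-nginx` namespace (adjust names to your environment):

```yaml
# HPA that adds ingress replicas when average CPU crosses 80%.
# Note: HPA utilization is measured against resource *requests*, not limits,
# so set requests deliberately before relying on this.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx-controller
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```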
2. Verify traffic distribution across ingress pods
Check your Ingress resource definitions for wildcard host configurations (e.g., `host: '*'`) and compare `kubernetes_network_rx_size` across ingress pods to see if one pod is handling disproportionate traffic. The `single-ingress-container-overload-during-traffic-spikes` insight shows this is a common misconfiguration where all traffic hits a single container despite having multiple replicas, creating a single point of failure. If you see one pod receiving 80%+ of traffic while others are idle, you need to fix your Ingress rules to use specific hosts rather than wildcards.
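The fix for the wildcard misconfiguration is to declare explicit hosts in each Ingress rule. A sketch of the corrected shape, with hostname and service names as placeholder examples:

```yaml
# Explicit host rule instead of `host: '*'`, so the controller can route
# and balance traffic per service rather than funneling everything to
# one backend. Hostname and service names here are illustrative.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com   # specific host, not a wildcard
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```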
3. Distinguish ingress saturation from backend saturation
This is critical and often misdiagnosed: check your backend application pods' `kubernetes_cpu_usage` alongside ingress metrics. If backend pods show moderate CPU (40-70%) while response times degrade, you're likely hitting backend concurrency limits from event loop blocking, not ingress capacity issues. The `load-balancer-queue-buildup-from-backend-concurrency-limits` insight explains how NGINX queues requests when backends appear busy despite having CPU headroom. In this case, scaling ingress won't help—you need to address backend concurrency or add backend replicas instead.
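When the diagnosis points at backend concurrency rather than ingress capacity, the remedy is more backend workers or replicas. A hypothetical FastAPI backend Deployment fragment (image, names, and worker count are assumptions for illustration):

```yaml
# Scale the backend, not the ingress: more replicas plus more uvicorn
# workers per pod reduce the chance a blocked event loop stalls requests
# and causes upstream queue buildup in NGINX.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 4            # add backend replicas instead of ingress replicas
  selector:
    matchLabels: {app: api}
  template:
    metadata:
      labels: {app: api}
    spec:
      containers:
        - name: api
          image: example/api:latest   # placeholder image
          args: ["uvicorn", "app:app", "--host", "0.0.0.0", "--workers", "4"]
```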
4. Check network bandwidth and error rates
Look at `kubernetes_network_rx_size` and `kubernetes_network_transaction_size` to see if you're hitting network bandwidth limits (typically 1-10 Gbps depending on your setup), and check `kubernetes_network_errors` during peak periods. High error rates (>1% of connections) when you're not at bandwidth limits often indicate configuration issues like insufficient worker connections, low keepalive settings, or timeout mismatches. These are tuning problems, not capacity problems—scaling won't fix them.
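These tuning knobs live in the ingress-nginx ConfigMap. A sketch of the relevant keys, assuming the default ConfigMap name from a standard install; the values are starting points to load-test, not recommendations:

```yaml
# ingress-nginx ConfigMap keys for connection-related errors.
# Name/namespace depend on how the controller was installed.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  max-worker-connections: "32768"        # raise if workers exhaust connections
  keep-alive: "75"                       # client keepalive timeout (seconds)
  keep-alive-requests: "1000"            # requests per client keepalive conn
  upstream-keepalive-connections: "512"  # idle keepalive conns to backends
  proxy-read-timeout: "60"               # align with backend response times
```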
5. Rule out pod churn as the root cause
Check if latency spikes and `kubernetes_network_errors` correlate with deployment events or pod restarts rather than traffic volume itself. The `network-error-rate-spike-during-pod-churn` insight shows connection failures often spike when pod IPs change during rollouts, as traffic continues hitting terminating pods. If your errors happen during deployments regardless of traffic level, you need proper preStop hooks with 10-30 second delays and connection draining configured, not more ingress capacity.
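A connection-draining setup can be sketched as a pod template fragment. The image name is a placeholder, the 15-second sleep is one example from the 10-30 second range above, and the hook assumes a `sleep` binary exists in the container image:

```yaml
# Delay SIGTERM so endpoint updates propagate before the pod stops
# accepting traffic, instead of dropping in-flight connections.
spec:
  terminationGracePeriodSeconds: 45   # must exceed the preStop delay
  containers:
    - name: app
      image: example/app:latest       # placeholder image
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "15"]
```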
Related Insights
Single Ingress Container Overload During Traffic Spikes
Severity: critical
Configuring an Ingress with a wildcard host directs all traffic to one container, overwhelming it during spikes and potentially taking down the entire cluster. The controller's load-balancing capabilities go unused, creating a single point of failure.
Load Balancer Queue Buildup from Backend Concurrency Limits
Severity: critical
When backend FastAPI workers experience event loop blocking, NGINX ingress accumulates requests in upstream queues as backends appear busy despite having available CPU capacity. This results in increased queue wait times and eventually gateway timeouts for clients, creating a cascading latency problem.
Network Error Rate Spike During Pod Churn
Severity: warning
Connection failures and network errors spike during rolling deployments or node failures when pod IPs change faster than service mesh or load balancer updates can propagate. Downstream services continue sending traffic to terminating pods or stale endpoints.