Rate Limiting Configuration for DDoS Protection
critical
Incident Response
Configuring NGINX rate limiting to protect against DDoS attacks and prevent upstream server overload from excessive requests.
Prompt: “We're experiencing what looks like a DDoS attack with massive request spikes. How should I configure NGINX rate limiting (limit_req) with the right burst and nodelay settings to protect my backends without blocking legitimate users?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When facing a DDoS attack, first confirm abnormal request patterns and assess backend saturation, then test rate limiting in dry-run mode before enforcing it. Start with conservative rate limits and use burst buffers to protect legitimate traffic while shedding attack volume. Monitor worker-saturation signals (499 errors, queue buildup) to validate that your configuration protects backends without over-blocking.
1. Confirm abnormal request patterns and baseline normal traffic
Check `nginx_net_request_per_s` and `nginx_server_zone_requested` to quantify the spike. Compare current rates to your baseline — if you normally handle 1K req/s and you're seeing 10K+ req/s, you have a clear attack pattern. Look at the distribution across endpoints and source IPs in your access logs to distinguish attack traffic from legitimate load spikes (like a product launch).
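Collecting these rate metrics assumes NGINX exposes a status endpoint that your monitoring agent scrapes. A minimal sketch using the open-source `stub_status` module (the listen port and location path here are illustrative choices, not required values):

```nginx
server {
    listen 127.0.0.1:8080;          # internal-only status listener

    location /nginx_status {
        stub_status;                # exposes active connections and request totals
        allow 127.0.0.1;            # restrict access to the local scraper
        deny all;
    }
}
```

Comparing successive request totals from this endpoint gives you the req/s rate to measure the spike against your baseline.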
2. Assess backend saturation and worker exhaustion
Check `nginx_server_zone_processing` to see how many requests are actively queued — sustained high values mean backends are overwhelmed. Look at `nginx_server_zone_discarded` for requests dropped without responses, which indicates severe saturation. The `requests-queue-in-backlog-worker-saturation` insight explains how requests queue when workers are busy, causing end-to-end latency to spike even when backend processing time looks normal.
3. Check for 499 status codes indicating client timeouts
Search your nginx access logs for 499 status codes, which appear when clients close connections before receiving responses — a classic sign of worker saturation. As the `worker-saturation-499-errors` insight notes, when workers are saturated with slow requests, clients time out and nginx logs 499s. High 499 counts confirm you need rate limiting to shed load before it reaches backends.
4. Test rate limiting in dry-run mode first
Configure `limit_req_dry_run on` with a conservative rate (e.g., 10r/s per IP) and monitor `nginx_limit_requested_rejected_dry_run` and `nginx_limit_requested_delayed_dry_run`. This shows you what would be blocked/delayed without actually affecting production traffic. Use these metrics to tune your rate and burst values before enforcing — you want to block attack traffic while minimizing false positives on legitimate users.
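A minimal dry-run sketch, assuming the zone name `perip` and upstream name `backend` (both illustrative). The zone and limit are defined as they would be for enforcement, but `limit_req_dry_run on` makes NGINX only log would-be rejections:

```nginx
http {
    # 10 MB shared zone keyed by client IP; 10r/s is a conservative starting rate
    limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

    server {
        location / {
            limit_req zone=perip burst=20;
            limit_req_dry_run on;        # log violations, never block or delay
            limit_req_log_level warn;    # surface dry-run hits in the error log
            proxy_pass http://backend;
        }
    }
}
```

Watch the dry-run rejection/delay metrics and the error log while this runs; once false positives on known-good clients are rare, you can drop `limit_req_dry_run` to enforce.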
5. Implement rate limiting with burst and nodelay tuned to your traffic
Based on dry-run results, set the zone rate in `limit_req_zone` to a level that allows normal traffic (e.g., 20r/s per IP) and apply `limit_req` with `burst=10` to absorb legitimate bursts like page loads with multiple assets. Add `nodelay` so requests within the burst are served immediately rather than queued for pacing, while anything beyond the burst is rejected at once; this prevents attacker requests from tying up worker slots. The `missing-timeouts-rate-limits` insight emphasizes that without these controls, slow clients and attack traffic will exhaust resources and degrade service for everyone.
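The enforced version of the same configuration, as a sketch (zone name `perip` and upstream `backend` are again illustrative):

```nginx
http {
    limit_req_zone $binary_remote_addr zone=perip:10m rate=20r/s;

    server {
        location / {
            # Absorb short legitimate bursts (e.g., parallel asset fetches),
            # serve them without pacing delay, reject anything beyond the burst.
            limit_req zone=perip burst=10 nodelay;
            limit_req_status 429;        # signal well-behaved clients to back off
            proxy_pass http://backend;
        }
    }
}
```

Returning 429 rather than the default 503 distinguishes rate-limited requests from genuine backend failures in your logs and dashboards.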
6. Monitor effectiveness and adjust thresholds
After enforcing rate limits, watch `nginx_server_zone_processing` to confirm queued requests decrease and `nginx_net_request_per_s` stabilizes at sustainable levels. Check that 499 errors and `nginx_server_zone_discarded` counts drop, indicating backends are no longer saturated. If you see legitimate traffic being blocked (check application error rates), raise the rate or burst slightly. Rate limiting is iterative — tune based on attack patterns and legitimate usage.
Technologies
Related Insights
Missing request timeouts and rate limits allow resource exhaustion
warning
Worker saturation manifests as nginx 499 status codes
warning
Request Queue Buildup Indicates Worker Exhaustion
critical
When NGINX exhausts its worker capacity (roughly `worker_processes` × `worker_connections`), new requests queue at the load balancer level, causing gateway timeouts even when CPU and dependencies appear healthy. This often manifests as moderate CPU (~50-60%) but rising tail latency and 502/504 errors.
Requests queue in connection backlog when workers are saturated
warning