Apache HTTP Server

Apache Rate Limit Breach from Concurrent Operations

warning
Resource Contention
Updated Feb 11, 2026

When Apache handles a burst of concurrent requests (e.g., 10+ beams with 100k+ token contexts), the burst can exceed the rate limit even though average throughput stays within the tier limit. The excess requests are rejected with 429 (Too Many Requests) errors, which causes workflow friction.

How to detect:

Watch for apache_net_request_per_s spiking above the tier's RPM limit divided by 60 (the per-second budget), or for apache_workers approaching maximum capacity while the request rate remains high. Also check the access logs for clusters of 429 responses that coincide with a high apache_current_backend connection count.
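The two metric conditions above can be sketched as a small check. This is only an illustration: the function name, its parameters, and the 0.9 and 0.8 ratios used to interpret "approaches max capacity" and "remains high" are assumptions, not values from any monitoring product. The inputs are expected to be sampled values of apache_net_request_per_s and apache_workers.

```python
def check_burst_risk(request_per_s, tier_rpm_limit, busy_workers, max_workers,
                     worker_ratio=0.9, high_rate_ratio=0.8):
    """Flag the two detection conditions for one metric sample.

    worker_ratio and high_rate_ratio are hypothetical thresholds; tune
    them for your own deployment.
    """
    # The tier limit is per minute; the metric is per second.
    per_second_budget = tier_rpm_limit / 60.0

    # Condition 1: request rate spikes above the per-second budget.
    rate_exceeded = request_per_s > per_second_budget

    # Condition 2: workers near capacity while the request rate is still high.
    workers_near_max = busy_workers >= worker_ratio * max_workers
    rate_high = request_per_s >= high_rate_ratio * per_second_budget

    return {
        "per_second_budget": per_second_budget,
        "rate_exceeded": rate_exceeded,
        "worker_pressure": workers_near_max and rate_high,
    }
```

For example, with a 600 RPM tier the per-second budget is 10, so a sample of 12 requests per second sets rate_exceeded, while 9 requests per second with 240 of 256 workers busy sets worker_pressure instead.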

Recommended action:

Implement client-side rate limiting with a token bucket algorithm that tracks requests over a rolling 60-second window. Apply a 10% safety margin by multiplying the configured limit by 0.9. Space requests evenly rather than sending them in bursts; this maximizes effective throughput within the rate limit.
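A minimal sketch of such a limiter follows, assuming the limit is given in requests per minute. The class and parameter names are hypothetical. Rather than storing per-request timestamps, this version approximates the rolling 60-second window by refilling tokens continuously at (limit × 0.9) / 60 per second; with a burst capacity of 1, requests are spaced evenly, and no 60-second window admits more than about limit × 0.9 + 1 requests.

```python
import time


class TokenBucket:
    """Client-side token bucket sized from a requests-per-minute limit."""

    def __init__(self, rpm_limit, safety=0.9, burst=1, clock=time.monotonic):
        # Effective rate in requests per second, after the safety margin.
        self.rate = rpm_limit * safety / 60.0
        # A small capacity forces even spacing instead of bursts.
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        """Take one token if available; return True on success."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready)."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate

    def acquire(self, sleep=time.sleep):
        """Block until a token is available, then take it."""
        while not self.try_acquire():
            sleep(self.wait_time())
```

A caller would create one bucket per rate-limited endpoint, for example `bucket = TokenBucket(rpm_limit=600)`, and call `bucket.acquire()` before sending each request. The injectable `clock` and `sleep` arguments exist only to make the limiter testable without real delays.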