Proxy Cache Hit Ratio Optimization
Cost Optimization
Improving NGINX proxy cache hit ratio to reduce backend load and improve response times through better cache configuration.
Prompt: “My NGINX proxy cache hit ratio is only 60% and my backends are getting hammered with requests. How should I tune proxy_cache_valid, cache keys, and cache locking to improve cache efficiency and reduce origin load?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When optimizing NGINX proxy cache hit ratios, start by analyzing the breakdown of hits, misses, bypasses, and expired responses to identify whether the issue is fragmented cache keys, short TTLs, or unnecessary bypasses. Then systematically tune your cache key to only vary on response-affecting parameters, increase proxy_cache_valid TTLs based on content update patterns, ensure adequate cache size to prevent premature eviction, and enable cache locking to prevent thundering herd issues.
1. Analyze cache effectiveness metrics to identify the root cause
Start by examining `nginx-cache-hit-responses`, `nginx-cache-miss-responses`, `nginx-cache-bypass-responses`, and `nginx-cache-expired-responses`. Calculate the percentage of each type — if bypasses are >10%, you have config issues; if expired responses are >20% of hits, your TTLs are too short; if misses are high but `nginx-cache-miss-responses-write` is also high, you have cache key fragmentation creating too many unique entries. This breakdown tells you exactly where to focus your tuning efforts.
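As a rough illustration, this breakdown can be computed directly from the raw counters. The metric names follow the article; the counter values below are hypothetical:

```python
# Sketch: classify cache effectiveness from raw counters.
# Counter values are hypothetical; metric names follow the article.
counters = {
    "nginx-cache-hit-responses": 60_000,
    "nginx-cache-miss-responses": 30_000,
    "nginx-cache-bypass-responses": 8_000,
    "nginx-cache-expired-responses": 15_000,
}

total = sum(counters.values())
pct = {name: 100 * count / total for name, count in counters.items()}

hit_ratio = pct["nginx-cache-hit-responses"]
bypass_pct = pct["nginx-cache-bypass-responses"]
expired_vs_hits = (100 * counters["nginx-cache-expired-responses"]
                   / counters["nginx-cache-hit-responses"])

findings = []
if bypass_pct > 10:
    findings.append("bypasses >10% of traffic: audit proxy_cache_bypass")
if expired_vs_hits > 20:
    findings.append("expired >20% of hits: raise proxy_cache_valid TTLs")
if pct["nginx-cache-miss-responses"] > 25:
    findings.append("high miss rate: check for cache key fragmentation")

print(f"hit ratio: {hit_ratio:.1f}%")
for finding in findings:
    print("-", finding)
```

With these sample numbers the sketch flags short TTLs and key fragmentation but not bypasses, pointing the tuning effort at steps 2 and 3 first.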
2. Audit your proxy_cache_key for unnecessary variation
Check your `proxy_cache_key` directive — the default `$scheme$proxy_host$request_uri` includes all query parameters, which fragments the cache unnecessarily. If you're caching API responses where certain query params don't affect output (tracking IDs, timestamps, session tokens), or including cookies that don't vary the response, you're creating unique cache entries for identical content. The `no-request-caching` insight confirms that identical requests should reuse cached responses — review your cache key to include only response-affecting variables.
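A minimal sketch of the change, assuming query parameters never affect the response for the location in question (adjust per endpoint before adopting):

```nginx
# Default key fragments the cache on every query-string permutation:
#   proxy_cache_key $scheme$proxy_host$request_uri;
# If query params do not affect this location's responses, key on the
# normalized path instead ($uri excludes the query string):
proxy_cache_key $scheme$proxy_host$uri;
```

If some parameters do matter, build the key from only those named variables (e.g. `$arg_page`) rather than the whole `$args` string.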
3. Optimize proxy_cache_valid directives based on expiration patterns
Compare `nginx-cache-expired-responses` to `nginx-cache-hit-responses` — if expired is more than 20% of your hits, you're expiring content too aggressively. Review your `proxy_cache_valid` settings and increase TTLs for status codes that can be cached longer. Don't rely solely on backend Cache-Control headers if they're overly conservative; use `proxy_ignore_headers` selectively and set appropriate TTLs based on actual content update frequency.
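A hedged example of longer TTLs (the values are illustrative, not from the source; tune them to how often your content actually changes):

```nginx
# Illustrative TTLs; adjust to your content's real update frequency.
proxy_cache_valid 200 301 302 1h;   # cache successful responses for an hour
proxy_cache_valid 404         5m;   # short negative-caching TTL
# Only if the origin's Cache-Control/Expires headers are known to be
# overly conservative; this makes the TTLs above authoritative:
proxy_ignore_headers Cache-Control Expires;
```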
4. Verify cache size isn't causing premature eviction
Check if `nginx-cache-size` is approaching `nginx-cache-max-size`. If you're above 80% capacity, NGINX is evicting entries via LRU before they expire naturally, tanking your hit ratio. Size your cache to hold your working set — look at the rate of `nginx-cache-miss-responses-write` to estimate how much new content you're adding per hour, multiply by your average TTL, and ensure max_size accommodates that volume.
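A back-of-envelope sizing sketch with hypothetical numbers (the path, zone name, and rates are assumptions for illustration):

```nginx
# Assuming ~50k new entries/hour (from nginx-cache-miss-responses-write),
# a 2h average TTL, and ~40 KiB per object:
#   50,000 * 2 * 40 KiB ≈ 3.8 GiB working set
# so size max_size with headroom above that:
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=app_cache:100m
                 max_size=6g inactive=4h use_temp_path=off;
```

Keep `inactive` longer than your typical TTL so entries expire on schedule rather than being dropped for inactivity first.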
5. Enable proxy_cache_lock to prevent thundering herd
Without `proxy_cache_lock`, when a popular resource expires or is evicted, multiple concurrent requests will all slam the backend simultaneously. Enable it with `proxy_cache_lock on` and set `proxy_cache_lock_timeout 5s` to ensure only one request populates the cache while others wait. Monitor `nginx-upstream-peers-requested` for spikes that correlate with cache expirations — cache locking should smooth these out and dramatically reduce backend load.
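The locking setup described above, with two optional complements that go beyond the prompt (serving stale content during refresh is an assumption about what your application can tolerate):

```nginx
proxy_cache_lock on;            # one request per key populates the cache
proxy_cache_lock_timeout 5s;    # waiters fall through to origin after 5s
# Optional (assumptions, verify staleness is acceptable for your content):
# serve the stale entry while a single request refreshes it in the background.
proxy_cache_use_stale updating error timeout;
proxy_cache_background_update on;
```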
6. Review and minimize cache bypass conditions
High `nginx-cache-bypass-responses` means requests are skipping the cache entirely and hitting your backends. Review your `proxy_cache_bypass` directives — common culprits include bypassing all requests with cookies (when only auth cookies matter), bypassing based on request method too broadly, or bypassing due to upstream headers. Only bypass when the response truly shouldn't be cached; eliminating unnecessary bypasses will immediately reduce backend load.
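As a sketch of narrowing a cookie-based bypass, assuming the auth cookie is named `session` (a hypothetical name; substitute your application's):

```nginx
# In the http{} block: bypass only when an auth cookie is present,
# instead of bypassing every request that carries any cookie at all.
map $cookie_session $skip_cache {
    default 1;   # session cookie present: skip the cache
    ""      0;   # no session cookie: safe to serve/store cached responses
}

# In the location{} block:
proxy_cache_bypass $skip_cache;
proxy_no_cache     $skip_cache;
```

Pairing `proxy_cache_bypass` with `proxy_no_cache` keeps the two decisions consistent: bypassed requests are also not written into the cache.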