Memory Eviction Causing Database Thundering Herd

Critical · Incident Response

Redis hitting maxmemory limits triggers aggressive evictions, causing cache misses that overwhelm the backend database with simultaneous requests.

Prompt: My Redis cache is evicting keys rapidly and our database is getting hammered with queries. I'm seeing 10x normal database load and response times are terrible. Help me diagnose if this is a memory pressure issue in Redis and whether I need to scale up or tune eviction policies.

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When diagnosing Redis memory eviction causing database overload, start by confirming memory pressure (used memory vs maxmemory limit and eviction rate), then quantify the cache effectiveness degradation (hit rate drop), and correlate eviction timing with database load spikes to confirm the thundering herd pattern. Finally, review your eviction policy configuration and key TTL distribution to determine whether you need to scale Redis memory or optimize your caching strategy.

1. Confirm Redis is hitting memory limits
Check `redis-memory-used` against `redis-memory-maxmemory` — if used memory is consistently above 90% of maxmemory, you're in eviction territory. Look at the `redis-keys-evicted` rate over the past hour: a sudden spike or sustained high rate (thousands per second for a busy instance) confirms active eviction pressure. The `high-eviction-rate-indicates-memory-pressure` insight tells us this is the root cause when memory approaches limits.
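As a minimal sketch of this check, the snippet below parses `redis-cli INFO`-style output and computes memory utilization and the cumulative eviction counter. The sample INFO text is fabricated, but the field names (`used_memory`, `maxmemory`, `evicted_keys`) are real Redis INFO keys.

```python
def parse_info(info_text: str) -> dict:
    """Turn 'key:value' lines from Redis INFO output into a dict."""
    fields = {}
    for line in info_text.splitlines():
        if ":" in line and not line.startswith("#"):
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields

def memory_pressure(fields: dict) -> tuple:
    """Return (utilization ratio, cumulative evicted keys)."""
    used = int(fields["used_memory"])
    maxmem = int(fields["maxmemory"])
    return used / maxmem, int(fields["evicted_keys"])

# Illustrative INFO sample -- real output comes from `redis-cli INFO memory stats`
sample_info = """# Memory
used_memory:1932735283
maxmemory:2147483648
# Stats
evicted_keys:184032
"""

utilization, evicted = memory_pressure(parse_info(sample_info))
print(f"memory utilization: {utilization:.0%}, evicted keys: {evicted}")
# Sustained utilization above 90% with a climbing evicted_keys counter
# confirms active eviction pressure.
```

In practice you would sample `evicted_keys` at two points in time and divide by the interval to get the eviction rate the step describes.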
2. Quantify cache effectiveness degradation
Calculate your cache hit rate: `redis-keyspace-hits` / (`redis-keyspace-hits` + `redis-keyspace-misses`). A healthy Redis cache typically runs 90%+ hit rate; if you're seeing it drop to 70% or below during eviction spikes, you're losing your hot data. Compare current hit rate to your baseline from before the issue started — a 20-30% drop in hit rate directly correlates with the database hammering you're experiencing.
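The hit-rate arithmetic from this step can be sketched directly from the keyspace counters that `INFO stats` reports. The counter values below are illustrative, chosen to mirror the 90%-plus baseline and the drop into the 60-70% range described above.

```python
def hit_rate(hits: int, misses: int) -> float:
    """Cache hit rate = hits / (hits + misses)."""
    total = hits + misses
    return hits / total if total else 0.0

# Illustrative counter snapshots (keyspace_hits / keyspace_misses)
baseline = hit_rate(hits=9_450_000, misses=550_000)   # before the incident
current = hit_rate(hits=6_800_000, misses=3_200_000)  # during eviction spikes

drop = baseline - current
print(f"baseline {baseline:.1%}, current {current:.1%}, drop {drop:.1%}")
```

Note that `keyspace_hits` and `keyspace_misses` are cumulative since server start, so compare deltas over a window rather than raw totals when measuring the current hit rate.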
3. Correlate eviction spikes with database load
Plot `redis-keys-evicted` rate alongside your database query rate and latency metrics on the same timeline. If you see database queries spike immediately after Redis eviction bursts (within seconds), you've confirmed the thundering herd: cache misses are causing simultaneous backend queries. The `database-bottleneck-without-caching` insight explains why this creates the 10x load you're seeing — without caching, every request hits the database directly.
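A simple way to confirm the timing relationship, sketched below with made-up per-minute samples and arbitrary thresholds: flag each interval where a database query spike immediately follows an eviction burst.

```python
# Fabricated per-minute samples of eviction counts and DB query counts
evictions_per_min = [120, 110, 9800, 10200, 130, 125, 8900, 140]
db_queries_per_min = [4000, 4100, 4200, 41000, 43000, 4300, 4200, 39000]

# Example thresholds -- tune these to your own baseline
EVICTION_BURST = 5000
QUERY_SPIKE = 30000

thundering_herd = [
    i + 1  # index of the DB spike that follows an eviction burst
    for i in range(len(evictions_per_min) - 1)
    if evictions_per_min[i] > EVICTION_BURST
    and db_queries_per_min[i + 1] > QUERY_SPIKE
]
print("DB spikes following eviction bursts at minutes:", thundering_herd)
```

A consistent one-interval lag from eviction burst to query spike is the thundering-herd signature this step describes.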
4. Review your eviction policy configuration
Check your Redis `maxmemory-policy` setting (in redis.conf, or at runtime via `CONFIG GET maxmemory-policy`). If you're using `allkeys-lru` but storing sessions or critical data, any key can be evicted once it goes cold, including ones you can't afford to lose — the `redis-memory-eviction-session-loss` insight warns about this. For mixed workloads, `volatile-lru` (only evict keys with a TTL) is often safer. If you're already on `volatile-lru` and still seeing aggressive eviction, you likely need more memory rather than policy tuning.
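The policy review above boils down to a small decision table. The helper below is an illustrative sketch of that logic, not Redis tooling; the advice strings are assumptions drawn from the step.

```python
def policy_advice(policy: str, stores_critical_data: bool) -> str:
    """Suggest a direction given the configured maxmemory-policy."""
    if policy == "noeviction":
        return "writes fail at maxmemory; choose an eviction policy or add memory"
    if policy.startswith("allkeys-") and stores_critical_data:
        return "critical keys can be evicted; consider volatile-lru with TTLs on cache-only keys"
    if policy.startswith("volatile-"):
        return "only TTL'd keys are evicted; if pressure persists, scale memory"
    return "review policy against workload"

# Example: session data on an allkeys-lru instance
print(policy_advice("allkeys-lru", stores_critical_data=True))
```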
5. Analyze key TTL distribution and access patterns
Look at which keys are being evicted — are they your hottest data (frequently accessed) or stale entries? Compare `redis-net-commands` volume to eviction rate: if you're processing millions of commands but evicting thousands of keys per second, your working set is too large for available memory. Review TTL settings on your most frequently accessed keys — overly long TTLs on low-value data can crowd out hot cache entries.
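One way to make the TTL review concrete, assuming you sample TTLs with the `TTL` command across a subset of keys: bucket the results and look at the shape of the distribution. The sample values below are fabricated.

```python
from collections import Counter

def ttl_bucket(ttl: int) -> str:
    """Bucket a TTL value in seconds; Redis TTL returns -1 for no expiry."""
    if ttl < 0:
        return "no-ttl"
    if ttl < 300:
        return "<5m"
    if ttl < 3600:
        return "5m-1h"
    return ">1h"

# Fabricated TTLs sampled from keys (seconds)
sampled_ttls = [-1, 45, 120, 86400, 7200, 30, -1, 600, 86400, 90]
distribution = Counter(ttl_bucket(t) for t in sampled_ttls)
print(dict(distribution))
# A large "no-ttl" or ">1h" share on rarely-read keys suggests cold data
# is crowding out the hot working set.
```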
6. Determine scaling vs optimization path
If you have physical memory available on your Redis host, increasing maxmemory is the fastest fix — the `high-eviction-rate-indicates-memory-pressure` insight recommends this when sub-90% memory utilization isn't achievable through tuning. However, if memory is constrained or evictions resume quickly after scaling, focus on optimization: reduce TTLs on less-critical data, implement tiered caching, or shard across multiple Redis instances. The goal is keeping your working set under 80% of maxmemory with headroom for traffic spikes.
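The 80%-of-maxmemory target above implies a back-of-envelope sizing calculation, sketched here with an assumed working-set size:

```python
def required_maxmemory(working_set_bytes: int, target_utilization: float = 0.8) -> int:
    """Maxmemory needed so the working set sits at the target utilization."""
    return int(working_set_bytes / target_utilization)

working_set = 3 * 1024**3  # assumed ~3 GiB of hot data
needed = required_maxmemory(working_set)
print(f"maxmemory target: {needed / 1024**3:.2f} GiB")
```

If the host can't provide that headroom, that's the signal to shift to the optimization path: shorter TTLs on low-value data, tiered caching, or sharding.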
7. Verify graceful degradation handling
While fixing the root cause, ensure your application isn't making the problem worse. The `cache-failure-fallback` insight warns that cache errors should degrade to database queries, not throw 500 errors. Check your application logs for Redis exceptions during eviction storms — if you're seeing cascading failures rather than slower-but-working database fallback, your error handling needs hardening to survive cache pressure events.
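The fallback pattern this step describes can be sketched as a read-through helper where both cache misses and cache errors degrade to a database read. `failing_cache` and `db_lookup` below are stand-in callables for illustration, not a real client library.

```python
def get_user(user_id, cache_get, db_get):
    """Read-through lookup: cache errors and misses both fall back to the DB."""
    try:
        value = cache_get(user_id)
        if value is not None:
            return value
    except Exception:
        # Cache outage or eviction storm: fall through to the database
        # rather than failing the request with a 500.
        pass
    return db_get(user_id)

# Stand-ins simulating a Redis outage and a working database
def failing_cache(_key):
    raise ConnectionError("redis unavailable")

def db_lookup(user_id):
    return {"id": user_id, "name": "example"}

print(get_user("u42", failing_cache, db_lookup))
```

In production you would also rate-limit or coalesce the fallback queries (e.g. per-key locking or request collapsing) so the degraded path doesn't itself become a thundering herd.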

Monitoring Interfaces

Redis Datadog
Redis Prometheus
Redis Native Metrics