Eviction Policy Misconfiguration Causing OOM Errors

Critical · Incident Response

Redis rejects writes with OOM errors despite having an eviction policy configured, typically because a volatile-* policy is paired with keys that lack TTLs.

Prompt: Our Redis is throwing OOM errors saying 'command not allowed when used memory > maxmemory' even though we have maxmemory-policy set to volatile-lru. We're not at 100% memory but writes are being rejected. Why isn't Redis evicting keys and how do I fix this without losing data?

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When Redis throws OOM errors with volatile-* eviction policies, the root cause is almost always keys without TTLs — these policies can't evict keys that lack expiration times. Start by checking the ratio of expiring keys to total keys, then verify memory pressure, and finally decide whether to add TTLs or switch to an allkeys-* policy based on your data model.

1. Check the ratio of keys with TTLs to total keys
Calculate `redis.db.expires` / `redis.db.keys` — if this ratio is below 0.3, you've found your problem. The volatile-lru policy can only evict keys that have a TTL set, so if 70%+ of your keys are persistent (no expiration), Redis literally has nothing to evict and will reject writes even at 80-90% memory usage. Also check `redis.persist` to see the absolute count of keys without TTLs that are accumulating.
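This check can be scripted against `redis-cli INFO keyspace` output. A minimal sketch, using an illustrative `db0` line in place of live output (the key counts are made up for the example):

```shell
# Compute the expiring-to-total key ratio from an INFO keyspace line.
# In production, replace the sample with live output, e.g.:
#   info_line=$(redis-cli INFO keyspace | tr -d '\r' | grep '^db0:')
info_line='db0:keys=1200000,expires=180000,avg_ttl=3600000'
keys=$(echo "$info_line" | sed -n 's/.*keys=\([0-9]*\).*/\1/p')
expires=$(echo "$info_line" | sed -n 's/.*expires=\([0-9]*\).*/\1/p')
ratio=$(awk -v e="$expires" -v k="$keys" 'BEGIN { printf "%.2f", e / k }')
echo "expiring/total ratio: $ratio"
# Below 0.3, a volatile-* policy has too little eligible data to evict.
if awk -v r="$ratio" 'BEGIN { exit !(r < 0.3) }'; then
  echo "WARNING: too few keys with TTLs for a volatile-* policy to evict"
fi
```

With the sample numbers the ratio is 0.15, well under the 0.3 threshold, so the warning fires.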
2. Verify memory usage relative to maxmemory
Compare `redis.memory.used` to `redis.memory.maxmemory` to understand your actual headroom. Even if you're only at 85% memory, Redis will reject writes when it can't evict anything. This confirms you're hitting a real memory limit, not a misconfigured maxmemory setting. If maxmemory looks too low for your workload, increasing it is a temporary band-aid but won't fix the underlying TTL problem.
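The headroom calculation can be done the same way from `redis-cli INFO memory`. A sketch with illustrative byte counts (roughly 3.4 GiB used of a 4 GiB limit):

```shell
# Compare used_memory to maxmemory from INFO memory output.
# In production: mem_info=$(redis-cli INFO memory | tr -d '\r')
# (live INFO output has CRLF line endings, hence the tr).
mem_info='used_memory:3650722201
maxmemory:4294967296'
used=$(echo "$mem_info" | awk -F: '$1 == "used_memory" { print $2 }')
max=$(echo "$mem_info" | awk -F: '$1 == "maxmemory" { print $2 }')
pct=$(awk -v u="$used" -v m="$max" 'BEGIN { printf "%.1f", 100 * u / m }')
echo "memory usage: ${pct}% of maxmemory"
# Even at ~85%, Redis rejects writes if no keys are eligible for eviction.
```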
3. Check if any evictions are happening at all
Look at `redis.keys.evicted` over time — if it's zero or near-zero while you're getting OOM errors, this definitively confirms that Redis can't find eligible keys to evict under your volatile-lru policy. If evictions are happening but at a low rate, it means only a small subset of keys are eligible and they're being churned while the bulk of your data (keys without TTLs) stays resident.
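The eviction counter is exposed as `evicted_keys` in `redis-cli INFO stats`. A sketch with an illustrative counter value:

```shell
# Confirm whether any evictions are happening at all.
# In production: stats=$(redis-cli INFO stats | tr -d '\r')
stats='evicted_keys:0
expired_keys:52104'
evicted=$(echo "$stats" | awk -F: '$1 == "evicted_keys" { print $2 }')
echo "evicted_keys: $evicted"
# Zero evictions while OOM errors occur means the volatile-* policy
# cannot find any eligible (TTL-bearing) keys to evict.
if [ "$evicted" -eq 0 ]; then
  echo "no evictions: the volatile-* policy has no eligible keys"
fi
```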
4. Identify which keys lack TTLs and why
Use `redis-cli --scan` together with the `TTL` command to sample keys and find patterns in what's missing expirations. Often it's a specific code path or feature (sessions, job queues, certain cache types) where developers forgot to set TTLs. Understanding which keys are persistent tells you whether you can safely add TTLs or whether you need to switch to allkeys-lru because some data is intentionally permanent.
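A sketch of the sampling step. The summarizer below consumes `key ttl` pairs; the pipeline in the comment showing how to produce them from a live instance, and the sample pairs themselves, are illustrative:

```shell
# Summarize TTL status for a sample of keys.
# In production, feed classify_ttls from a live scan, e.g.:
#   redis-cli --scan | head -1000 | while read -r k; do
#     printf '%s %s\n' "$k" "$(redis-cli TTL "$k")"
#   done | classify_ttls
# TTL return values: -1 = key has no expiration, >= 0 = seconds remaining.
classify_ttls() {
  awk '{ if ($2 == -1) persistent++; else if ($2 >= 0) expiring++ }
       END { printf "expiring=%d persistent=%d\n", expiring + 0, persistent + 0 }'
}
# Illustrative sample pairs:
result=$(printf '%s\n' \
  "session:ab12 86400" \
  "cache:user:7 -1" \
  "cache:user:9 -1" \
  "jobqueue:42 -1" | classify_ttls)
echo "$result"
```

A high persistent count concentrated under a few prefixes (here `cache:*` and `jobqueue:*`) points at the code paths that forgot to set TTLs.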
5. Choose between fixing TTLs and changing the eviction policy
If all your data should be temporary (pure cache), add TTLs to keys using `EXPIRE` or `SET ... EX` and keep volatile-lru. If you have a mix of permanent and temporary data (like sessions that must not be evicted), you need to either separate them into different Redis instances or switch to allkeys-lru — but be warned that allkeys-lru can evict critical data like sessions, as noted in the redis-memory-eviction-session-loss insight. For job queues like BullMQ, use noeviction policy as recommended in the redis-oom-blocks-job-processing insight.
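The policy change itself is a one-line config switch; a sketch of the options against a live instance (the choice of option depends on your data model, as above):

```shell
# Option A: pure cache -- keep volatile-lru and add TTLs in application code.
# Option B: mixed permanent and cache data on one instance -- switch to
#           allkeys-lru, accepting that any key (including sessions) may
#           be evicted:
redis-cli CONFIG SET maxmemory-policy allkeys-lru
redis-cli CONFIG REWRITE   # persist the runtime change back to redis.conf
# Option C: job-queue workloads (e.g. BullMQ) -- use noeviction so queue
#           state is never silently dropped:
#   redis-cli CONFIG SET maxmemory-policy noeviction
```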
6. Implement TTLs in application code and monitor
Update your application to set appropriate TTLs on all cached data: use `SET key value EX seconds` or `SETEX` instead of bare `SET` commands. Review your key naming patterns to encode intended lifetime (e.g., `session:*` = 24h, `cache:*` = 1h). After deployment, watch the `redis.db.expires` / `redis.db.keys` ratio climb above 0.7 and `redis.keys.evicted` begin to increase as Redis can now properly evict old data, preventing future OOM errors.
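One way to keep lifetimes consistent is a single lookup that maps key prefixes to TTLs. A sketch; the prefixes and durations are illustrative conventions from this playbook, not Redis defaults:

```shell
# Map key-name prefixes to intended lifetimes (illustrative values).
ttl_for_key() {
  case "$1" in
    session:*) echo 86400 ;;  # 24 hours
    cache:*)   echo 3600  ;;  # 1 hour
    *)         echo 300   ;;  # conservative 5-minute default
  esac
}
t_session=$(ttl_for_key "session:ab12")
t_cache=$(ttl_for_key "cache:user:7")
echo "session TTL: ${t_session}s, cache TTL: ${t_cache}s"
# In application writes, always pass the TTL explicitly, e.g.:
#   redis-cli SET "cache:user:7" "$payload" EX "$(ttl_for_key cache:user:7)"
```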

Monitoring Interfaces

Redis Datadog
Redis Prometheus
Redis Native Metrics