SLOWLOG Analysis for Performance Optimization
Proactive Health
Analyzing the Redis SLOWLOG to identify inefficient commands and operations that degrade overall performance and increase latency.
Prompt: “Our Redis P95 latency has crept up from 5ms to 25ms over the past few weeks. I want to check SLOWLOG to see if there are specific commands causing problems. What should I look for in SLOWLOG output and what are common culprits? How do I determine if these slow commands are from our application or if Redis itself is struggling?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When Redis P95 latency degrades, first examine slowlog metrics to confirm the issue is at the command level, then identify which specific commands appear in the slowlog. Focus on O(N) operations such as KEYS, SMEMBERS, and HGETALL that become expensive at scale, and distinguish high-frequency moderate-latency commands from rare high-latency ones by comparing total CPU time consumed.
1. Check slowlog accumulation and latency distribution
Start with `redis.slowlog.length` and `redis.slowlog.micros.95percentile` to confirm the slowlog is actually growing and latency is degrading. If `redis.slowlog.length` is increasing over time and `redis.slowlog.micros.95percentile` has jumped from ~5000 microseconds to ~25000 microseconds, you have confirmed the issue is at the command level. If slowlog is empty or stable, the latency issue is likely network, persistence, or system-level rather than specific commands.
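The check above can be sketched in a few lines. This is a minimal illustration, not a monitoring integration: the sample values are hypothetical, standing in for `redis.slowlog.length` and per-command microsecond samples exported from your monitoring backend.

```python
# Sketch: confirm slowlog growth and a p95 jump from exported metric samples.
# All sample values below are hypothetical; in practice they come from your
# monitoring backend (redis.slowlog.length over time, command durations in us).

def p95(values):
    """95th percentile by nearest-rank on a sorted copy."""
    ordered = sorted(values)
    rank = max(0, round(0.95 * len(ordered)) - 1)
    return ordered[rank]

slowlog_length_samples = [12, 18, 25, 31, 44, 60]          # hypothetical trend
command_micros = [4800, 5200, 6100, 24000, 26500, 25100, 5400, 23800]

# Slowlog length monotonically rising + p95 in the ~25,000 us range together
# confirm a command-level problem rather than a network or system-level one.
growing = all(b >= a for a, b in zip(slowlog_length_samples, slowlog_length_samples[1:]))
print("slowlog growing:", growing)
print("p95 micros:", p95(command_micros))
```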
2. Query the slowlog for command patterns
Run `SLOWLOG GET 100` to retrieve the actual slow commands and look for patterns — are the same command types appearing repeatedly? Are they coming from specific keys or key patterns? The slowlog shows the command, arguments, timestamp, and execution time, which tells you exactly what Redis considers slow. Group entries by command type (GET, SET, HGETALL, KEYS, etc.) to identify which operations are problematic.
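Grouping slowlog entries by command type can be sketched as below. The entries are hypothetical but shaped like what redis-py's `slowlog_get` returns (`id`, `start_time`, `duration` in microseconds, and the raw `command` bytes).

```python
from collections import Counter, defaultdict

# Hypothetical slowlog entries, shaped like redis-py's slowlog_get() output.
entries = [
    {"id": 101, "start_time": 1700000000, "duration": 24000, "command": b"HGETALL session:active"},
    {"id": 102, "start_time": 1700000005, "duration": 31000, "command": b"KEYS user:*"},
    {"id": 103, "start_time": 1700000009, "duration": 22000, "command": b"HGETALL session:active"},
]

# Group by the command verb (first token) to see which operation types repeat.
by_command = Counter()
total_micros = defaultdict(int)
for e in entries:
    cmd = e["command"].split()[0].decode().upper()
    by_command[cmd] += 1
    total_micros[cmd] += e["duration"]

for cmd, n in by_command.most_common():
    print(f"{cmd}: {n} entries, {total_micros[cmd]} us total")
```

The same grouping works on raw `SLOWLOG GET 100` output from redis-cli; only the parsing of the entry fields differs.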
3. Identify expensive O(N) operations
Check `redis.commands.usec` and `redis.commands.usec_per_call` for commands like KEYS, SMEMBERS, HGETALL, LRANGE, and ZRANGE — these have O(N) complexity and become expensive on large data structures. If you see KEYS in production, that's an immediate red flag; replace it with SCAN. If SMEMBERS, HGETALL, or ZRANGE are showing high `usec_per_call` values (>10,000 microseconds), they're likely running on large collections and should be replaced with SSCAN, HSCAN, ZSCAN or paginated with limits.
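The flagging rule above can be expressed as a short sketch. The per-command stats here are hypothetical, shaped like the `usec_per_call` values Redis exposes via `INFO commandstats`; the threshold is the ~10,000-microsecond cutoff from the step above.

```python
# O(N) commands that become expensive on large structures, and the incremental
# alternatives suggested in the step above.
O_N_COMMANDS = {"KEYS", "SMEMBERS", "HGETALL", "LRANGE", "ZRANGE"}
REPLACEMENT = {"KEYS": "SCAN", "SMEMBERS": "SSCAN", "HGETALL": "HSCAN", "ZRANGE": "ZSCAN"}

# Hypothetical per-command stats (mean microseconds per call).
commandstats = {"GET": 12.5, "HGETALL": 18500.0, "KEYS": 95000.0, "SET": 9.8}

flags = []
for cmd, usec_per_call in commandstats.items():
    if cmd in O_N_COMMANDS and usec_per_call > 10_000:
        flags.append((cmd, REPLACEMENT.get(cmd, "pagination with limits")))
print(flags)
```

Note that KEYS would be flagged regardless of latency in production; the threshold matters for the collection-scanning commands whose cost depends on collection size.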
4. Analyze total CPU impact: frequency vs latency
Compare `redis.command.calls` against `redis.commands.usec` to find the worst offenders by total time impact. A command called 100,000 times with 500μs average latency (50 seconds total CPU) is far worse than a command called 10 times with 100ms latency (1 second total). Sort commands by `redis.commands.usec` descending to see which operations are consuming the most Redis CPU time overall, not just the slowest individual calls.
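The frequency-versus-latency comparison can be made concrete with the numbers from the step above. This is a sketch over hypothetical stats, not a live query:

```python
# Rank commands by total CPU time (calls x mean latency), not peak latency.
# Numbers mirror the example above: 100,000 calls at 500 us vs 10 calls at 100 ms.
stats = {
    "HGETALL": {"calls": 100_000, "usec_per_call": 500},
    "ZRANGEBYSCORE": {"calls": 10, "usec_per_call": 100_000},
}

by_total = sorted(
    stats.items(),
    key=lambda kv: kv[1]["calls"] * kv[1]["usec_per_call"],
    reverse=True,
)
for cmd, s in by_total:
    print(cmd, s["calls"] * s["usec_per_call"] / 1_000_000, "seconds total CPU")
```

The high-frequency command dominates (50 seconds vs 1 second of total Redis CPU), even though its individual calls never look alarming in the slowlog.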
5. Check for connection pool exhaustion symptoms
If `redis.slowlog.length` is trending upward alongside increasing blocked clients or connection pool warnings in your application logs, the slow commands may be exhausting your connection pool and causing cascading failures. This pattern — slowlog growing + connections being held — indicates operations are blocking application threads waiting for Redis responses. You may need to increase connection pool size, add timeouts, or move expensive operations to a dedicated Redis instance to isolate impact.
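Detecting the "slowlog growing + connections held" pattern amounts to checking that both series trend upward over the same window. A minimal sketch, with hypothetical monitoring samples and a deliberately simple trend test:

```python
# Sketch: flag the cascading-failure pattern when slowlog length and blocked
# clients rise together. Sample series are hypothetical monitoring exports.

def trending_up(series):
    # Crude trend check: mean of the second half exceeds mean of the first half.
    mid = len(series) // 2
    return sum(series[mid:]) / (len(series) - mid) > sum(series[:mid]) / mid

slowlog_len = [10, 12, 15, 22, 30, 41]
blocked_clients = [0, 1, 1, 3, 5, 8]

pool_exhaustion_risk = trending_up(slowlog_len) and trending_up(blocked_clients)
print("pool exhaustion pattern:", pool_exhaustion_risk)
```

A real alerting rule would use a more robust trend estimate, but the logic is the same: neither signal alone is conclusive, while both rising together points at blocked application threads.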
6. Correlate slowlog spikes with persistence operations
Check if slowlog entries spike during Redis background save operations (RDB snapshots or AOF rewrites). If `redis.slowlog.micros.95percentile` spikes correlate with persistence windows, Redis may be fork-blocking or experiencing memory pressure during saves. This is environmental, not application code, but explains the latency pattern. Consider adjusting persistence frequency, using AOF instead of RDB, or scaling to a larger instance with more memory to handle fork overhead.
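The correlation can be sketched as a simple overlap check: what fraction of slowlog entry timestamps fall inside known background-save windows? The windows and timestamps below are hypothetical; in practice, save windows would come from `rdb_last_save_time` in `INFO persistence` or AOF-rewrite log lines.

```python
# Sketch: fraction of slow entries that occurred during persistence windows.
# Windows are (start, end) unix timestamps; all values are hypothetical.
save_windows = [(1700000100, 1700000115), (1700000400, 1700000420)]
slow_timestamps = [1700000050, 1700000105, 1700000110, 1700000410]

in_window = sum(
    any(start <= ts <= end for start, end in save_windows)
    for ts in slow_timestamps
)
ratio = in_window / len(slow_timestamps)
print(f"{ratio:.0%} of slow entries fall inside persistence windows")
```

A high ratio suggests the environmental, fork-related cause described above rather than a problem in application command patterns.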
Related Insights
Command Latency Spikes from Expensive O(N) Operations
Certain Redis commands have O(N) complexity (KEYS, SMEMBERS, HGETALL, LRANGE without limit) and can cause latency spikes when executed on large data structures. Per-command `redis.commands.usec` tracking identifies hot spots.
Slow Query Backlog Masks Redis Connection Pool Exhaustion
Redis slowlog entries accumulating (`redis.slowlog.length` rising) can indicate operations blocking on network or disk I/O, exhausting connection pools and causing cascading failures in dependent services even when Redis CPU appears healthy.
Relevant Metrics
Monitoring Interfaces
Redis Native Metrics