Confusing Resource Metrics During Event Loop Blocking
warningFastAPI services experiencing event loop blocking show counterintuitive metrics: moderate CPU utilization (50-60%), healthy dependency performance, but rising tail latency and timeouts. This pattern indicates worker starvation rather than resource exhaustion.
Alert when p95/p99 latencies increase and request timeouts rise while CPU utilization remains moderate (<70%), database query times stay healthy, and cache hit rates remain normal. This divergence between resource availability and request completion suggests event loop blocking.
Instrument the event loop directly using structured logging around critical code sections to measure actual blocking time. Investigate endpoints with CPU-heavy operations (JSON normalization, signature verification) and third-party SDK calls for synchronous execution patterns. Use load tests that increase concurrency to reproduce throughput plateaus that confirm blocking.