FastAPI, Prometheus

SLO Burn Rate Alert Pattern

warning
reliability
Updated Jul 21, 2023

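SLO-based alerts on error budget burn rate give early warning that service health is degrading, before a complete failure occurs. A burn rate above 1 means the service is consuming its error budget faster than the SLO allows: at a constant burn rate of 2, for example, the budget is exhausted halfway through the SLO period.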

How to detect:

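Calculate the error budget burn rate as the observed rate of bad events divided by the rate the SLO allows. For a 99.9% success rate SLO the allowed error rate is 0.1%, so an observed error rate of 0.2% is a burn rate of 2. Alert when the burn rate exceeds 1.0 over a sliding time window: 1h to catch fast burn, 6h to catch slow burn. Track both a success rate SLO (99.9%) and a latency SLO (P99 < 250 ms, that is, roughly 1% of requests at most may take longer than 250 ms).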
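
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `burn_rate` and its parameters are hypothetical, and the event counts are assumed to have been gathered over one sliding window (for example the last 1h).

```python
def burn_rate(bad_events: int, total_events: int, slo_target: float) -> float:
    """Observed bad-event fraction divided by the fraction the SLO allows.

    slo_target is the SLO as a fraction, e.g. 0.999 for a 99.9% success rate
    or 0.99 for a P99 latency objective. (Hypothetical helper for illustration.)
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        return 0.0  # no traffic in the window, so no budget was consumed
    error_budget = 1.0 - slo_target  # allowed fraction of bad events
    return (bad_events / total_events) / error_budget


# Success rate SLO of 99.9%: 2 failed requests out of 1000 in the window.
availability_burn = burn_rate(bad_events=2, total_events=1000, slo_target=0.999)

# Latency SLO of P99 < 250 ms: 30 of 1000 requests were slower than 250 ms.
latency_burn = burn_rate(bad_events=30, total_events=1000, slo_target=0.99)

print(availability_burn)  # about 2: errors arrive at twice the allowed rate
print(latency_burn)       # about 3: slow requests at three times the allowed rate
```

Treating the latency SLO as a count of slow requests lets both SLOs share one formula: any request that breaks the objective is a bad event.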

Recommended action:

Implement multi-window burn rate alerts to balance sensitivity against response time: a short window reacts quickly, while a long window filters out brief spikes. Use a 99.9% success rate and a P99 latency target as baseline SLOs. Configure Prometheus alert rules that fire only on sustained burn rate violations, not on momentary ones. Review and adjust the SLO targets based on actual business requirements and user impact.
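
One way to express such a rule is to build it in Python and then write it to a Prometheus rule file. The sketch below rests on several assumptions: the counter `http_requests_total` and its `status` label are hypothetical and should be replaced with whatever your FastAPI instrumentation emits, the alert name and the 5m hold period are illustrative, and requiring both windows to exceed the threshold is one possible reading of "multi-window". The rule keys `alert`, `expr`, `for`, `labels`, and `annotations` are standard Prometheus alerting rule fields.

```python
def burn_rate_expr(window: str, slo_target: float) -> str:
    """PromQL for the success-rate burn rate over one window.

    Assumes a hypothetical counter http_requests_total with a status label.
    """
    budget = 1.0 - slo_target
    errors = f'sum(rate(http_requests_total{{status=~"5.."}}[{window}]))'
    total = f"sum(rate(http_requests_total[{window}]))"
    return f"({errors} / {total}) / {budget:g}"


def burn_rate_alert(slo_target: float = 0.999, threshold: float = 1.0,
                    fast: str = "1h", slow: str = "6h", hold: str = "5m") -> dict:
    """Build one alerting rule that requires both windows to exceed the threshold."""
    conditions = [f"{burn_rate_expr(w, slo_target)} > {threshold:g}"
                  for w in (fast, slow)]
    return {
        "alert": "ErrorBudgetBurn",         # illustrative name
        "expr": " and ".join(conditions),   # both windows must be burning
        "for": hold,                        # must hold this long before firing
        "labels": {"severity": "warning"},
        "annotations": {
            "summary": f"Error budget burn rate above {threshold:g} "
                       f"over {fast} and {slow}",
        },
    }


rule = burn_rate_alert()
print(rule["expr"])
```

The returned dictionary can be serialized into a rule group with a YAML library of your choice. The `for` clause is what makes the alert fire only on sustained violations: Prometheus waits until the expression has been continuously true for that duration.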