Reliability Topology & Dependency Risk
Redundancy, failover, and failure propagation across interacting systems.
1. What is the blast radius if this dependency becomes unavailable?
Map the blast radius of a single-component failure: which services are affected, whether failover is automatic, what retry/circuit-breaker/timeout behavior applies, and what the expected degradation looks like.
2. Which services have no redundancy or are deployed in a single availability zone?
Identify services with no redundancy or single-availability-zone deployment; include deployment topology and dependency context.
3. Are failover triggers, health checks, and auto-scaling policies producing the intended outcome?
Validate that failover triggers, health checks, and auto-scaling policies produce the intended outcome under realistic conditions, not just that they exist in configuration.
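As a rough illustration of questions 1 and 2, the sketch below walks a dependency graph to list every service that could be affected when one component fails, and flags services that run in fewer than two availability zones. The service names, the `DEPENDS_ON` and `ZONES` tables, and both helper functions are hypothetical; a real inventory would come from a service catalog or deployment manifests. The result is an upper bound on impact, since it ignores failover, retries, circuit breakers, and graceful degradation.

```python
from collections import deque

# Hypothetical topology: service -> services it calls (its dependencies).
DEPENDS_ON = {
    "web": ["api"],
    "api": ["auth", "db"],
    "auth": ["db"],
    "reports": ["db"],
    "db": [],
}

# Hypothetical deployment data: service -> availability zones it runs in.
ZONES = {
    "web": ["az-a", "az-b"],
    "api": ["az-a", "az-b"],
    "auth": ["az-a"],
    "reports": ["az-b"],
    "db": ["az-a", "az-b"],
}


def blast_radius(failed, depends_on):
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a dependency to its callers.
    callers = {}
    for service, deps in depends_on.items():
        for dep in deps:
            callers.setdefault(dep, []).append(service)

    # Breadth-first search over the inverted graph.
    affected, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, []):
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)

    # A dependency cycle could lead back to the failed service itself.
    affected.discard(failed)
    return affected


def single_zone_services(zones):
    """Return services deployed in fewer than two distinct availability zones."""
    return sorted(name for name, azs in zones.items() if len(set(azs)) < 2)


if __name__ == "__main__":
    print(sorted(blast_radius("db", DEPENDS_ON)))
    print(single_zone_services(ZONES))
```

With this sample data, a failure of `db` reaches every other service, while a failure of `auth` reaches only `api` and `web`. Cross-referencing the two results shows which single-zone services also sit on a critical dependency path.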