
Reliability Topology & Dependency Risk

Redundancy, failover, and failure propagation across interacting systems.

1. What is the blast radius if this dependency becomes unavailable?

Map the blast radius of a single-component failure: which services are affected, whether failover is automatic, what retry/circuit-breaker/timeout behavior applies, and what the expected degradation looks like.
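As a rough illustration, the "which services are affected" part can be approximated by walking a dependency graph backwards from the failed component. The sketch below assumes a hand-maintained map of direct dependencies; the service names and the `blast_radius` helper are hypothetical, and it covers reachability only, not failover, retry, or timeout behavior.

```python
from collections import deque

# Hypothetical dependency map: service -> services it calls directly.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["auth", "orders-db"],
    "auth": ["users-db"],
    "reports": ["orders-db"],
    "orders-db": [],
    "users-db": [],
}


def blast_radius(depends_on, failed):
    """Return {service: hops} for every service that transitively depends on `failed`."""
    # Invert the edges so we can walk from a dependency to its callers.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first search records the shortest hop count to each affected service.
    hops = {}
    queue = deque([(failed, 0)])
    while queue:
        current, distance = queue.popleft()
        for caller in sorted(dependents.get(current, ())):
            if caller != failed and caller not in hops:
                hops[caller] = distance + 1
                queue.append((caller, distance + 1))
    return hops


if __name__ == "__main__":
    for service, distance in blast_radius(DEPENDS_ON, "orders-db").items():
        print(f"{service}: {distance} hop(s) from the failed dependency")
```

The hop count is only a starting point: a service one hop away with a working circuit breaker and fallback may degrade less than a service three hops away that retries without a timeout.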

2. Which services have no redundancy or are deployed in a single availability zone?

Identify services with no redundancy or single-availability-zone deployment; include deployment topology and dependency context.
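One way to approach this, assuming a deployment inventory is available as a list of replicas per service with their zones, is a simple filter like the sketch below. The `Deployment` record, the zone names, and the `redundancy_gaps` helper are hypothetical; a real inventory would come from the orchestrator or cloud provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    service: str
    zones: tuple  # availability zone of each running replica


def redundancy_gaps(deployments):
    """Return {service: [reasons]} for services lacking replica or zone redundancy."""
    findings = {}
    for deployment in deployments:
        reasons = []
        if len(deployment.zones) < 2:
            reasons.append("fewer than two replicas")
        if len(set(deployment.zones)) < 2:
            reasons.append("fewer than two availability zones")
        if reasons:
            findings[deployment.service] = reasons
    return findings


# Hypothetical inventory used only for illustration.
INVENTORY = [
    Deployment("api", ("zone-a", "zone-b")),
    Deployment("auth", ("zone-a", "zone-a")),
    Deployment("orders-db", ("zone-a",)),
]

if __name__ == "__main__":
    for service, reasons in redundancy_gaps(INVENTORY).items():
        print(f"{service}: {', '.join(reasons)}")
```

Combining these findings with a dependency map shows which single-zone services sit underneath the most callers and so carry the most risk.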

3. Are failover triggers, health checks, and auto-scaling policies producing the intended outcome?

Validate that failover triggers, health checks, and auto-scaling policies produce the intended outcome under realistic conditions, not merely that they exist in configuration.
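A minimal sketch of what "intended outcome" could mean in practice: state the expectation as numbers, run a failure drill, and compare what was observed against it. The `Expectation` and `DrillObservation` records, their fields, and the thresholds are all hypothetical; how the observations are collected depends on the monitoring stack in use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expectation:
    max_failover_seconds: float
    max_error_rate: float
    min_healthy_replicas: int


@dataclass(frozen=True)
class DrillObservation:
    failover_seconds: Optional[float]  # None means failover never triggered
    peak_error_rate: float
    healthy_replicas_after: int


def evaluate_drill(expected, observed):
    """Return a list of mismatches between intended and observed behavior."""
    failures = []
    if observed.failover_seconds is None:
        failures.append("failover did not trigger")
    elif observed.failover_seconds > expected.max_failover_seconds:
        failures.append("failover slower than intended")
    if observed.peak_error_rate > expected.max_error_rate:
        failures.append("error rate exceeded the intended ceiling")
    if observed.healthy_replicas_after < expected.min_healthy_replicas:
        failures.append("auto-scaling did not restore capacity")
    return failures


if __name__ == "__main__":
    expected = Expectation(max_failover_seconds=30.0, max_error_rate=0.05, min_healthy_replicas=3)
    observed = DrillObservation(failover_seconds=48.0, peak_error_rate=0.02, healthy_replicas_after=3)
    print(evaluate_drill(expected, observed) or "drill matched the intended outcome")
```

An empty result means the drill matched the stated expectation; it does not show that the expectation itself is adequate.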