Hints Accumulation Signals Silent Replica Failure
critical
A growing cassandra_storage_count_hints indicates that a replica node is unreachable or unable to accept writes, forcing coordinators to store those writes as hints for later replay. Persistent growth while the cluster otherwise appears healthy suggests a network partition or a zombie node.
Alert when cassandra_storage_count_hints rises continuously without corresponding node-down events, or when cassandra_storage_count_hints_in_progress remains elevated. Cross-reference with nodetool status: if every node shows UN (Up/Normal) while hints keep growing, the failure is silent.
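The alert condition can be sketched roughly as follows. This is a minimal illustration, not a production rule: it assumes the hint counter has already been scraped into a list of samples (oldest first) and that the text output of nodetool status is available as a string. The function names and the five-sample window are hypothetical choices.

```python
import re
from typing import Dict, Sequence

# Matches node rows in `nodetool status` output, e.g. "UN  10.0.0.1  ...".
# First letter: U(p)/D(own); second: N(ormal)/L(eaving)/J(oining)/M(oving).
_NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def node_states(status_output: str) -> Dict[str, str]:
    """Map each node address to its two-letter status code."""
    states: Dict[str, str] = {}
    for line in status_output.splitlines():
        match = _NODE_ROW.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def hints_rising(samples: Sequence[float], min_samples: int = 5) -> bool:
    """True if the last `min_samples` values are strictly increasing."""
    if len(samples) < min_samples:
        return False
    window = samples[-min_samples:]
    return all(later > earlier for earlier, later in zip(window, window[1:]))


def silent_replica_failure(
    samples: Sequence[float], status_output: str, min_samples: int = 5
) -> bool:
    """Hints keep growing although every node reports UN (Up/Normal)."""
    states = node_states(status_output)
    all_up_normal = bool(states) and all(s == "UN" for s in states.values())
    return all_up_normal and hints_rising(samples, min_samples)
```

A strictly increasing window is a deliberately simple stand-in for "rises continuously"; a real rule would more likely use a rate or derivative function in the monitoring system itself.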
Check network connectivity between the coordinator and replica nodes. Verify that a node which appears up isn't actually unreachable from some coordinators because of firewall rules or a network partition. If a node appears up but hints still accumulate for it, investigate gossip protocol issues or JMX port accessibility (default 7199).
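A basic reachability probe for these checks might look like the sketch below, run from a coordinator host against the suspect replica. It assumes the default inter-node storage port 7000 (the storage_port setting in cassandra.yaml; verify it for your cluster) alongside the JMX port 7199. The check_ports helper and its injectable connect parameter are hypothetical names introduced for illustration.

```python
import socket
from typing import Callable, Dict, Mapping

# Assumed defaults: 7000 is Cassandra's default storage_port, 7199 the default JMX port.
DEFAULT_PORTS: Mapping[str, int] = {"storage": 7000, "jmx": 7199}


def check_ports(
    host: str,
    ports: Mapping[str, int] = DEFAULT_PORTS,
    timeout: float = 3.0,
    connect: Callable = socket.create_connection,
) -> Dict[str, bool]:
    """Attempt a TCP connection to each port and report whether it succeeded."""
    results: Dict[str, bool] = {}
    for name, port in ports.items():
        try:
            conn = connect((host, port), timeout)
            conn.close()
            results[name] = True
        except OSError:
            # Covers refused connections, timeouts, and unresolvable hosts.
            results[name] = False
    return results
```

For example, check_ports("10.0.0.2") returning False for "storage" while nodetool status still lists the node as UN would point toward a firewall rule or partition affecting only some paths. A successful TCP connect only shows that the port is open; it does not prove the node is accepting writes.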