ray_cluster_active_nodes
Active nodes on the cluster
Dimensions: None
Available on:
Datadog (1)
Interface Metrics (1)
Related Insights (2)
Ray Client Connection Timeout Storm
Severity: critical
Ray client connections repeatedly time out when connecting to the cluster, indicating network unreachability, gRPC configuration issues, or head node unavailability.
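One way to narrow down these three causes is a plain TCP reachability check against the head node before involving the Ray client at all. The sketch below is illustrative only: the helper name `head_node_reachable` is hypothetical, and port 10001 is assumed to be the Ray Client server port (its usual default; adjust if the cluster is configured differently).

```python
import socket


def head_node_reachable(host: str, port: int = 10001, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s.

    Hypothetical helper. Port 10001 is an assumption (the usual Ray Client
    server port); pass the port your cluster actually exposes.
    """
    try:
        # create_connection raises OSError (including timeouts and refusals)
        # when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

If this check fails, the network path or the head node itself is the more likely culprit. If it succeeds while client connections still time out, gRPC configuration is the more plausible place to look.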
Ray Telemetry High-Cardinality Cost Explosion
Severity: warning
Attaching Kubernetes pod-level attributes (k8s.pod.id, k8s.node.ip) to Ray metrics dramatically increases cardinality and observability costs, especially in autoscaling environments where pods are ephemeral.
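The effect is easy to see in a small simulation: every distinct combination of attribute values becomes its own time series. The sketch below counts series before and after dropping the two pod-level attributes named above. The helper names, the `cluster` attribute, and the sample values are all hypothetical, and the code does not use any Ray or Datadog API.

```python
# Attributes named in the insight above; everything else here is illustrative.
HIGH_CARDINALITY_KEYS = {"k8s.pod.id", "k8s.node.ip"}


def drop_high_cardinality(attrs: dict[str, str]) -> dict[str, str]:
    """Return a copy of attrs without the pod-level attributes."""
    return {k: v for k, v in attrs.items() if k not in HIGH_CARDINALITY_KEYS}


def series_count(points: list[dict[str, str]]) -> int:
    """Count distinct attribute combinations, i.e. distinct time series."""
    return len({tuple(sorted(p.items())) for p in points})


# Simulate 50 ephemeral pods reporting the same metric from one cluster.
points = [
    {"cluster": "ray-prod", "k8s.pod.id": f"pod-{i}", "k8s.node.ip": f"10.0.0.{i}"}
    for i in range(50)
]

before = series_count(points)
after = series_count([drop_high_cardinality(p) for p in points])
print(before, after)  # prints "50 1"
```

In an autoscaling cluster the pod count keeps growing over time as pods are replaced, so the "before" figure is a lower bound on what a real deployment would accumulate.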