Elasticsearch Metric

elasticsearch.cluster.shards

Shard distribution status
Dimensions: None
Available on: Datadog (5), Prometheus (3), OpenTelemetry (1), Dynatrace (4)

Summary

Reports the total number of shards across the entire Elasticsearch cluster, including both primary and replica shards. This cluster health metric directly affects resource consumption and cluster stability. Excessive shard counts (in particular, more than about 20 shards per GB of heap, the rule of thumb given in the "Excessive Shard Count Degrades Performance" insight) increase master node overhead, slow cluster state updates, and degrade performance. Remediation typically means consolidating shards or reindexing.
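The rule of thumb above can be turned into a rough capacity check. This is a minimal sketch, not an official formula: the function names are hypothetical, and summing heap across data nodes to get a cluster-wide budget is an assumption (the rule itself is stated per GB of configured heap).

```python
def shard_budget(heap_gb_per_node: float, data_nodes: int, shards_per_gb: int = 20) -> int:
    """Upper bound on total shards suggested by the 20-per-GB-of-heap rule of thumb.

    Assumption: every data node has the same heap size, and the cluster-wide
    budget is the per-node budget multiplied by the number of data nodes.
    """
    return int(heap_gb_per_node * data_nodes * shards_per_gb)


def is_oversharded(total_shards: int, heap_gb_per_node: float, data_nodes: int) -> bool:
    """True when the observed shard count exceeds the rule-of-thumb budget."""
    return total_shards > shard_budget(heap_gb_per_node, data_nodes)
```

For example, three data nodes with 30 GB of heap each give a budget of 1800 shards under these assumptions, so a reading of 2000 from this metric would be flagged.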

Interface Metrics (13)
Datadog
The number of shards that are relocating from one node to another.
Dimensions: None
Prometheus
The number of shards that are currently moving from one node to another node.
Dimensions: None
Prometheus
The number of primary shards in your cluster. This is an aggregate total across all indices.
Dimensions: None
Datadog
The number of shards that are currently initializing.
Dimensions: None
Prometheus
Count of shards that are being freshly created.
Dimensions: None
OpenTelemetry
The number of shards in the cluster.
Dimensions: None
Dynatrace
Number of relocating shards
Dimensions: None
Datadog
The number of shards that are unassigned to a node.
Dimensions: None
Dynatrace
Number of delayed unassigned shards
Dimensions: None
Dynatrace
Number of initializing shards
Dimensions: None
Datadog
The number of active primary shards in the cluster.
Dimensions: None
Dynatrace
Number of unassigned shards
Dimensions: None
Datadog
The number of shards whose allocation has been delayed [v2.4+].
Dimensions: None
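The shard states listed above correspond to fields in the response of the Elasticsearch `GET _cluster/health` API. The sketch below summarizes such a response once it has been parsed into a dictionary. The helper name `summarize_shards` is hypothetical, and the total is computed on the assumption that relocating shards are already counted as active and that delayed unassigned shards are a subset of unassigned shards.

```python
# Shard-related fields returned by GET _cluster/health.
SHARD_FIELDS = (
    "active_primary_shards",
    "active_shards",
    "relocating_shards",
    "initializing_shards",
    "unassigned_shards",
    "delayed_unassigned_shards",
)


def summarize_shards(health: dict) -> dict:
    """Extract per-state shard counts from a parsed _cluster/health response."""
    counts = {field: int(health.get(field, 0)) for field in SHARD_FIELDS}
    # Assumption: relocating shards are included in active_shards and delayed
    # unassigned shards are included in unassigned_shards, so neither is added again.
    counts["total_shards"] = (
        counts["active_shards"]
        + counts["initializing_shards"]
        + counts["unassigned_shards"]
    )
    return counts
```

A caller would typically fetch the JSON with any HTTP client and pass the decoded body in; missing fields default to zero so that older responses without `delayed_unassigned_shards` still work.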

Technical Annotations (2)

Technical References (2)
shard allocation (component), Decision.Type (component)
Related Insights (9)
Unassigned Shards Blocking Recovery (critical)

Missing or unassigned shards indicate data unavailability and can degrade cluster performance. A red cluster status means at least one primary shard is unallocated, which puts data at risk of loss.

Excessive Shard Count Degrades Performance (critical)

Too many shards consume cluster resources even when idle, causing slow queries and increased overhead. Rule of thumb: keep the shard count below 20 per GB of configured heap.

Disk Watermark Shard Relocation Storm (warning)

When a node crosses the low disk watermark (85% full by default), Elasticsearch stops allocating new shards to it; once it crosses the high watermark (90% by default), Elasticsearch starts relocating shards away from it. Multiple nodes hitting watermarks simultaneously can trigger cascading relocations that overload cluster I/O and delay recovery.
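A small classifier can make the watermark stages concrete. This is a sketch under stated assumptions: the thresholds are the Elasticsearch defaults for `cluster.routing.allocation.disk.watermark.low` (85%), `.high` (90%), and `.flood_stage` (95%), a cluster may override them, and the function names and the use of `>=` at each boundary are illustrative choices.

```python
def watermark_state(disk_used_pct: float,
                    low: float = 85.0,
                    high: float = 90.0,
                    flood: float = 95.0) -> str:
    """Classify a node's disk usage against the disk watermarks.

    Defaults are the Elasticsearch default watermark percentages; pass the
    cluster's own settings if they have been changed.
    """
    if disk_used_pct >= flood:
        return "flood_stage"   # indices with shards on the node become read-only
    if disk_used_pct >= high:
        return "high"          # shards are relocated away from the node
    if disk_used_pct >= low:
        return "low"           # no new shards are allocated to the node
    return "ok"


def nodes_at_risk(usage_by_node: dict) -> dict:
    """Return only the nodes that have crossed at least the low watermark."""
    states = {node: watermark_state(pct) for node, pct in usage_by_node.items()}
    return {node: state for node, state in states.items() if state != "ok"}
```

Several nodes reporting `high` at the same time is the pattern to watch for, since that is when simultaneous relocations are most likely.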

Unassigned Shard Red Cluster Spiral (critical)

When primary shards cannot be assigned (elasticsearch.cluster.health == 2), data becomes unavailable and the cluster enters a red state. Common causes are too few nodes, misconfigured shard allocation rules, and node failures when replica coverage is insufficient.

Excessive Shard Count Degrading Performance (critical)

Too many shards consume cluster resources even when idle, causing slow queries, increased overhead, and reduced stability. Rule of thumb: keep the shard count below 20 per GB of configured heap.

Snapshot Lifecycle Policy Failures (critical)

Failed snapshot operations risk data loss and violate backup SLAs. SLM policy failures can result from repository unavailability, insufficient permissions, or storage quota issues.

Node Role Imbalance Causing Hotspots (warning)

Improper distribution of shards or unbalanced node roles can cause resource hotspots where some nodes are overloaded while others are underutilized.

Index Recovery Prolonged Duration (warning)

Slow shard recovery after node failures or restarts delays cluster stabilization and can indicate network issues, disk I/O bottlenecks, or configuration problems.

Allocation decision serialization BWC issue may affect mixed-version clusters (warning)