Apache Kafka Metric

kafka.broker.count

Total number of brokers in the cluster.
Dimensions: None
Available on: Datadog

About this metric

The kafka.broker.count metric measures the total number of active brokers currently registered and participating in a Kafka cluster. It serves as a critical health indicator for cluster topology and capacity, providing immediate visibility into whether the cluster is operating with its expected number of nodes. As a gauge, it reflects the instantaneous count of brokers connected to Apache ZooKeeper (in older Kafka versions) or participating in the KRaft consensus protocol (in newer versions), making it essential for monitoring cluster membership and detecting unexpected broker failures or network partitions.
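As a point of reference, the broker count can also be observed directly from the cluster metadata exposed by the Kafka Admin API. A minimal sketch using the confluent-kafka Python client (the bootstrap address is a placeholder, and `fetch_broker_count` is a hypothetical helper name):

```python
def broker_count(cluster_metadata):
    """Count brokers in a ClusterMetadata-like object whose `.brokers`
    attribute maps broker id -> broker info (as confluent-kafka returns)."""
    return len(cluster_metadata.brokers)


def fetch_broker_count(bootstrap="localhost:9092"):
    """Query a live cluster for its current broker count.

    Requires a reachable Kafka cluster and the confluent-kafka package;
    the address above is illustrative only.
    """
    from confluent_kafka.admin import AdminClient

    metadata = AdminClient({"bootstrap.servers": bootstrap}).list_topics(timeout=10)
    return broker_count(metadata)
```

A monitoring agent effectively samples this same membership information on an interval and reports it as the gauge value.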

From an operational perspective, this metric is fundamental to high availability, data durability, and overall cluster reliability. Kafka's replication mechanism depends on having enough brokers to maintain the configured replication factor for each topic. A sudden decrease in broker count can indicate catastrophic failures, problems during planned or unplanned maintenance, or infrastructure issues that compromise data availability and lead to under-replicated partitions.

Operators should establish a baseline for this metric based on their cluster design: a production cluster designed with six brokers should consistently report a count of six. A common alerting strategy is to trigger a critical alert when the broker count drops below the minimum required to maintain replication guarantees (typically the configured replication factor) and a warning alert on any deviation from the expected count. This metric is often the first one consulted when troubleshooting under-replicated partitions, increased consumer lag, or degraded cluster performance, since broker loss directly affects partition leadership distribution and available cluster capacity.
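The alerting strategy above can be sketched as a small severity function. The thresholds mirror the text (critical below the replication factor, warning on any deviation); the function name and severity labels are illustrative, not a specific monitoring product's API:

```python
def broker_alert(observed, expected, replication_factor):
    """Map an observed broker count to an alert severity.

    critical: fewer brokers than the replication factor, so some
              partitions cannot keep all of their replicas in sync.
    warning:  any deviation from the designed cluster size.
    ok:       observed count matches expectation.
    """
    if observed < replication_factor:
        return "critical"
    if observed != expected:
        return "warning"
    return "ok"
```

For the six-broker example cluster with a replication factor of 3, `broker_alert(5, 6, 3)` is a warning (one broker down, replicas can still be placed), while `broker_alert(2, 6, 3)` is critical.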

The kafka.broker.count metric also plays a vital role in capacity planning and cost management. Correlating historical broker counts with throughput and resource-utilization metrics helps teams decide when to scale clusters horizontally by adding brokers and when to decommission underutilized capacity. For organizations running Kafka in cloud environments or on managed services like Amazon MSK or Confluent Cloud, maintaining an optimal broker count directly affects infrastructure costs while ensuring sufficient headroom for traffic spikes and fault-tolerance requirements.
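One hedged way to combine broker count with throughput data for the scaling decision described above. All parameters (per-broker capacity, headroom fraction, tolerated broker failures) are illustrative assumptions, not Kafka-defined values:

```python
def needs_scale_out(broker_count, peak_throughput_mbps, per_broker_capacity_mbps,
                    headroom=0.3, fault_tolerance=1):
    """Return True if the cluster cannot absorb peak traffic while keeping
    the target headroom, even after losing `fault_tolerance` brokers.

    Assumes throughput spreads roughly evenly across brokers; real sizing
    must also account for partition skew, storage, and network limits.
    """
    surviving = broker_count - fault_tolerance
    usable_mbps = surviving * per_broker_capacity_mbps * (1 - headroom)
    return peak_throughput_mbps > usable_mbps
```

For example, six brokers rated at 100 MB/s each, tolerating one failure at 30% headroom, leave 350 MB/s of usable capacity, so a 300 MB/s peak fits; the same peak on four brokers does not.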