kafka.broker.config.default_replication_factor
Apache Kafka metric

Default replication factor config
Dimensions: None
Available on: Datadog

Datadog: Broker configuration for default replication factor.

About this metric

The kafka.broker.config.default_replication_factor metric exposes the configured default replication factor setting on Kafka brokers, which determines how many copies of a topic partition are maintained across the cluster when no explicit replication factor is specified during topic creation. This configuration parameter is critical for data durability and fault tolerance, as it directly impacts the cluster's ability to withstand broker failures without data loss. The metric reports as a gauge value representing the number of replicas that will be created by default, and this setting is typically configured in the broker's server.properties file as default.replication.factor. Understanding this value is essential for capacity planning, as higher replication factors consume more storage and network bandwidth but provide better resilience against hardware failures.
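As a reference for what this metric reflects, the setting lives in the broker's `server.properties` alongside related replication options. The values below are illustrative, not prescriptive:

```properties
# server.properties (broker configuration) -- illustrative values
# Replication factor applied when a topic is created without an explicit one
default.replication.factor=3

# Related settings commonly tuned together with the default replication factor
min.insync.replicas=2
auto.create.topics.enable=false
```

Changing this value affects only topics created afterward; existing topics keep the replication factor they were created with.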

From an operational perspective, monitoring this metric helps ensure consistency across cluster configurations and validates that production environments maintain appropriate redundancy levels. For most production deployments, a healthy default replication factor is 3, which balances data durability against resource utilization: the cluster can lose up to two brokers without losing data, and with the common min.insync.replicas=2 setting it can lose one broker without losing write availability for acks=all producers. A value of 1 means no replication and should generally be avoided in production except for non-critical or ephemeral data, as it creates a single point of failure. Conversely, values exceeding 3 may be appropriate for mission-critical data but come with increased storage costs and replication overhead that can impact write throughput.
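The durability/availability trade-off above can be sketched as a small calculation. This is a simplified model assuming producers use acks=all; the function name is illustrative:

```python
# Sketch: how many simultaneous broker failures a cluster can absorb
# for a given replication factor and min.insync.replicas setting.
# Assumes producers write with acks=all.

def failures_tolerated(replication_factor: int, min_insync_replicas: int) -> dict:
    """Failures tolerated before losing data vs. before losing writes."""
    return {
        # Data survives as long as at least one replica remains.
        "without_data_loss": replication_factor - 1,
        # acks=all writes succeed only while ISR size >= min.insync.replicas.
        "without_losing_writes": replication_factor - min_insync_replicas,
    }

# The common production setup: RF=3, min.insync.replicas=2.
print(failures_tolerated(3, 2))   # {'without_data_loss': 2, 'without_losing_writes': 1}
# RF=1: no replication -- any single broker failure loses data.
print(failures_tolerated(1, 1))   # {'without_data_loss': 0, 'without_losing_writes': 0}
```

This makes the distinction in the paragraph concrete: RF=3 protects the data through two failures, but write availability under acks=all is lost after the first failure once the ISR drops below min.insync.replicas.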

Common operational use cases include validating that broker configurations match expected deployment standards, particularly after configuration changes or during cluster migrations. Teams should alert on unexpected changes to this metric, as modifications could indicate unauthorized configuration drift or misconfigurations that could affect newly created topics. During troubleshooting scenarios, examining this metric alongside actual topic-level replication factors (which can override the default) helps identify whether under-replicated partitions stem from intentional topic configurations or cluster-wide defaults. The metric is particularly valuable when combined with monitoring of under-replicated partitions and in-sync replica (ISR) counts to assess overall cluster health and data resilience.
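One way to operationalize the comparison between topic-level replication factors and the cluster default is a small validation check. This is a hedged sketch: the topic-to-replication-factor mapping would come from your own metadata source (for example, parsed `kafka-topics.sh --describe` output or an admin-client call), and the names and sample data here are illustrative:

```python
# Sketch: flag topics whose replication factor falls below the broker
# default reported by kafka.broker.config.default_replication_factor.
# `topic_rfs` maps topic name -> observed replication factor; in practice
# it would be populated from cluster metadata, not hard-coded.

def topics_below_default(topic_rfs: dict, default_rf: int) -> list:
    """Return topics explicitly created with fewer replicas than the default."""
    return sorted(topic for topic, rf in topic_rfs.items() if rf < default_rf)

observed = {"orders": 3, "audit-log": 3, "scratch-events": 1}
print(topics_below_default(observed, default_rf=3))   # ['scratch-events']
```

Running a check like this alongside under-replicated-partition monitoring helps distinguish intentional low-replication topics from configuration drift in the cluster-wide default.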