Apache Pulsar

Message Backlog Explosion Signals Consumer Lag or Failure

critical
Resource Contention
Updated Jun 6, 2025

A growing message backlog indicates that consumers are falling behind producers or have stopped processing messages entirely. Left unchecked, it can cause memory pressure and disk exhaustion, and eventually message loss if a message TTL or a backlog quota eviction policy discards unacknowledged messages.

How to detect:

Monitor the pulsar_subscription_back_log metric for sustained growth across subscriptions. Cross-reference with pulsar_consumer_msg_rate_out to determine whether consumers are active but slow or have stopped entirely. Check pulsar_storage_backlog_size to assess disk pressure from unprocessed messages.
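
The check above can be sketched as a small classifier. This is a minimal illustration, not a production alert rule: it assumes the backlog samples and the dispatch rate have already been scraped from the broker's metrics endpoint, and the function names, the "sustained growth" definition (every sample higher than the previous one), and the status labels are all hypothetical.

```python
from typing import Sequence


def is_sustained_growth(samples: Sequence[float], min_step: float = 0.0) -> bool:
    """Return True when every consecutive sample rises by more than min_step."""
    if len(samples) < 2:
        return False  # not enough data points to call it a trend
    return all(later - earlier > min_step
               for earlier, later in zip(samples, samples[1:]))


def classify_subscription(backlog_samples: Sequence[float],
                          msg_rate_out: float) -> str:
    """Classify a subscription from its backlog trend and dispatch rate.

    backlog_samples: successive pulsar_subscription_back_log readings,
                     oldest first (assumed to be collected elsewhere).
    msg_rate_out:    latest pulsar_consumer_msg_rate_out summed over the
                     subscription's consumers, in messages per second.
    """
    if not is_sustained_growth(backlog_samples):
        return "ok"
    # Backlog is growing: a zero dispatch rate means nothing is being consumed.
    return "stalled" if msg_rate_out <= 0 else "lagging"
```

A "stalled" result points toward crashed or disconnected consumers, while "lagging" points toward insufficient throughput. Real backlog series are noisy, so a tolerance via min_step or a longer sampling window may be more appropriate than the strict comparison used here.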

Recommended action:

Investigate consumer health: check pulsar_consumer_unacked_messages for acknowledgment delays and review consumer application logs for processing errors. If throughput is insufficient, scale consumer instances horizontally or optimize the message processing logic to handle the load. If the backlog represents legitimate catch-up work, consider raising the backlog quota or message TTL so that unacknowledged messages are not discarded before consumers catch up.
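
When scaling horizontally, a rough capacity estimate can help decide how many consumer instances to run. The sketch below is a back-of-the-envelope calculation under simplifying assumptions: a steady produce rate, identical consumers, and perfectly even dispatch. The function name and parameters are hypothetical, and the input rates would have to be measured in your own deployment.

```python
import math


def consumers_needed(produce_rate: float, per_consumer_rate: float,
                     backlog: float, drain_seconds: float) -> int:
    """Estimate consumers required to drain a backlog within a target time.

    produce_rate:      incoming messages per second (assumed steady).
    per_consumer_rate: messages per second one consumer can process.
    backlog:           current number of unacknowledged messages.
    drain_seconds:     target time to bring the backlog to zero.
    """
    if per_consumer_rate <= 0 or drain_seconds <= 0:
        raise ValueError("per_consumer_rate and drain_seconds must be positive")
    # Consumers must keep up with new traffic and work off the backlog.
    required_rate = produce_rate + backlog / drain_seconds
    return max(1, math.ceil(required_rate / per_consumer_rate))


# Example: 1000 msg/s incoming, 300 msg/s per consumer,
# 600000 messages backlogged, target drain time of 10 minutes.
print(consumers_needed(1000, 300, 600_000, 600))  # prints 7
```

Adding consumers only raises throughput when the subscription type allows parallel consumption, such as Shared or Key_Shared, or when the topic is partitioned. With an Exclusive or Failover subscription on a single partition, only one consumer is active at a time.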