Translog Accumulation Risk
warning
The transaction log (translog) accumulates uncommitted operations between flushes. An excessively large translog lengthens recovery after node failures and can point to flush problems or misconfiguration.
elasticsearch.node.translog.uncommitted.size grows well past the flush threshold (more than 1GB per shard), or the elasticsearch.node.translog.operations count stays very high without corresponding flush operations.
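The metrics above are reported per node, while the 1GB guideline is per shard. One way to bridge the two is to read the shard-level index stats (for example GET /<index>/_stats/translog?level=shards) and flag individual shard copies. The following is a minimal sketch, assuming the response has already been fetched and parsed into a dict; the function name shards_over_threshold and the sample data are hypothetical and only illustrate the check.

```python
ONE_GIB = 1024 ** 3  # the ">1GB per shard" guideline from this entry


def shards_over_threshold(stats, limit_bytes=ONE_GIB):
    """Return (index, shard, node, uncommitted_bytes) for oversized translogs.

    `stats` is assumed to be the parsed JSON of a shard-level index stats
    response, where each shard number maps to a list of shard copies.
    """
    flagged = []
    for index_name, index_stats in stats.get("indices", {}).items():
        for shard_id, copies in index_stats.get("shards", {}).items():
            for copy in copies:
                translog = copy.get("translog", {})
                size = translog.get("uncommitted_size_in_bytes", 0)
                if size > limit_bytes:
                    node = copy.get("routing", {}).get("node")
                    flagged.append((index_name, int(shard_id), node, size))
    # Largest translog first, so the worst offenders are listed at the top.
    return sorted(flagged, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample shaped like a shard-level stats response.
    sample = {
        "indices": {
            "logs": {
                "shards": {
                    "0": [{"routing": {"node": "n1"},
                           "translog": {"uncommitted_size_in_bytes": 2 * ONE_GIB}}],
                    "1": [{"routing": {"node": "n2"},
                           "translog": {"uncommitted_size_in_bytes": 4096}}],
                }
            }
        }
    }
    for row in shards_over_threshold(sample):
        print(row)
```

Passing the parsed response in, instead of making the HTTP call inside the function, keeps the check independent of any particular client library or authentication setup.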
Check flush frequency with the _nodes/stats API. By default a flush triggers when the translog reaches 512MB (index.translog.flush_threshold_size). Note that index.translog.sync_interval does not trigger flushes: it controls how often the translog is fsynced to disk when durability is async (default 5s). Older releases also had a time-based trigger, index.translog.flush_threshold_period (default 30m), which current versions no longer support. If the translog grows beyond the threshold: (1) Verify in the logs that flush operations complete successfully, (2) Check disk I/O capacity - a slow disk prevents timely flushes, (3) Review the index.translog.durability setting (request vs async) - async improves indexing performance but can lose operations that were not yet fsynced when a node crashes. For large bulk loads, consider temporarily increasing flush_threshold_size, then resetting it after the load completes. Monitor recovery time after node restarts - long recoveries correlate with large translogs.
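The flush check and the temporary threshold change described above can be sketched as follows. This is a minimal illustration, not a definitive procedure: it assumes two parsed samples of GET _nodes/stats/indices/flush,translog taken some minutes apart, and the helper names (flush_progress, bulk_load_settings, reset_settings) and the "2gb" value are hypothetical. The settings bodies are meant for PUT /<index>/_settings; setting a value to null restores its default.

```python
import json


def flush_progress(before, after):
    """Compare two parsed _nodes/stats samples and report flush activity per node."""
    report = {}
    for node_id, node_after in after.get("nodes", {}).items():
        node_before = before.get("nodes", {}).get(node_id)
        if node_before is None:
            continue  # node joined between the two samples; nothing to compare
        flushes = (node_after["indices"]["flush"]["total"]
                   - node_before["indices"]["flush"]["total"])
        growth = (node_after["indices"]["translog"]["uncommitted_size_in_bytes"]
                  - node_before["indices"]["translog"]["uncommitted_size_in_bytes"])
        report[node_id] = {
            "flushes": flushes,
            "translog_growth_bytes": growth,
            # Translog grew while no flush completed: investigate logs and disk I/O.
            "stalled": flushes == 0 and growth > 0,
        }
    return report


def bulk_load_settings(size="2gb"):
    """Settings body that raises the flush threshold for a bulk load (size is an assumption)."""
    return {"index": {"translog": {"flush_threshold_size": size}}}


def reset_settings():
    """Settings body that restores the default threshold; None serializes to JSON null."""
    return {"index": {"translog": {"flush_threshold_size": None}}}


if __name__ == "__main__":
    print(json.dumps(bulk_load_settings()))
    print(json.dumps(reset_settings()))
```

A node reported as stalled is only a prompt to look at the flush logs and disk I/O; a short sampling window can show zero flushes on a healthy node that is simply below the threshold.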