Monitor Store Growth Causes Query Delays
warning | Resource Contention | Updated Jun 4, 2024
When the monitor store (a RocksDB database) grows too large (Ceph warns by default at 15GB), monitor queries slow down, which can cause client timeouts and delayed leader elections. If the /var partition fills completely, the monitors terminate.
How to detect:
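Check `ceph health detail` for 'store is getting too big' warnings. Inspect the monitor store size at /var/lib/ceph/mon/ceph-<hostname>/store.db (the default location for a cluster named `ceph`; containerized deployments may keep it elsewhere). Alert when the store exceeds 15GB or /var partition usage exceeds 80%.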
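The checks above could be scripted roughly as follows. This is a minimal sketch, not an official Ceph tool: the function names and the alert wording are hypothetical, the 15GB figure is treated as 15 GiB, and the usage fraction comes from `shutil.disk_usage`, which can differ slightly from what `df` reports.

```python
import os
import shutil

GIB = 1024 ** 3


def has_store_warning(health_detail_text):
    """Return True if `ceph health detail` output mentions the store-size warning."""
    return "store is getting too big" in health_detail_text


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def check_mon_store(store_path, size_limit_bytes=15 * GIB, usage_limit=0.80):
    """Compare store size and partition usage against the alert thresholds.

    store_path is an assumption supplied by the caller, for example the
    store.db directory of one monitor.
    """
    size = directory_size_bytes(store_path)
    usage = shutil.disk_usage(store_path)  # partition that holds store_path
    used_fraction = usage.used / usage.total
    alerts = []
    if size > size_limit_bytes:
        alerts.append(f"store size {size} bytes exceeds {size_limit_bytes} bytes")
    if used_fraction > usage_limit:
        alerts.append(f"partition usage {used_fraction:.0%} exceeds {usage_limit:.0%}")
    return size, used_fraction, alerts
```

Calling `check_mon_store` with the store.db path of a monitor returns the measured size, the partition usage fraction, and a list of alert strings that is empty when both values are within limits.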
Recommended action:
Use `ceph-monstore-tool` to compact the store; never delete monitor data manually. If the store size is legitimate, raise the mon_data_size_warn threshold (mon_data_avail_warn controls the free-disk-space warning, not the store-size warning). If store corruption occurs ('Corruption: error in middle of record'), follow the monitor recovery procedures or replace the monitor.
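As an illustration of the compaction and threshold steps, the sketch below only builds the command lines and does not run them, so they can be reviewed first. It makes several assumptions: that the store-size warning is governed by `mon_data_size_warn` (expressed in bytes), that `ceph-monstore-tool` takes the monitor data directory followed by `compact` and is run while that monitor is stopped, and that `ceph tell mon.<id> compact` is available as an online alternative. The helper names are hypothetical; verify each command against the documentation for your Ceph release.

```python
GIB = 1024 ** 3


def offline_compact_command(mon_data_dir):
    """argv for compacting a stopped monitor's store (argument form is an assumption)."""
    return ["ceph-monstore-tool", mon_data_dir, "compact"]


def online_compact_command(mon_id):
    """argv for asking a running monitor to compact its own store."""
    return ["ceph", "tell", f"mon.{mon_id}", "compact"]


def raise_size_warn_command(new_limit_gib):
    """argv for raising the store-size warning threshold to new_limit_gib GiB."""
    return ["ceph", "config", "set", "mon",
            "mon_data_size_warn", str(new_limit_gib * GIB)]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in (
        online_compact_command("a"),      # "a" is a placeholder monitor id
        raise_size_warn_command(20),      # 20 GiB is an arbitrary example value
    ):
        print(" ".join(argv))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` later without shell quoting issues, once they have been checked.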