Apache Solr

Indexing Throughput Bottleneck from Disk I/O Saturation

warning
Resource Contention
Updated Apr 17, 2024

Slow indexing rates accompanied by high disk I/O utilization indicate a storage bottleneck. Segment merges compete with indexing writes for disk bandwidth, which caps indexing throughput and can raise query latency while merges run.

How to detect:

Monitor the indexing rate over time to establish a baseline. Watch for disk I/O utilization that stays at or near 100%. Correlate indexing slowdowns with merge time metrics. Check commit times: lengthy commits that block other operations indicate I/O saturation.
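The correlation step can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring integration: the `Sample` type, the `saturated_windows` function, and the threshold values are all hypothetical, and the samples are assumed to come from whatever already collects indexing rate and disk utilization (for example, `%util` from iostat). Using the median indexing rate as the baseline is also an assumption.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Sample:
    docs_per_sec: float   # indexing rate over one sampling interval
    disk_util_pct: float  # disk utilization for the same interval


def saturated_windows(samples, util_threshold=95.0, min_run=3, slowdown=0.7):
    """Return (start, end) index pairs where disk utilization stays at or
    above util_threshold for at least min_run consecutive samples while the
    indexing rate is below slowdown * baseline (baseline = median rate)."""
    if not samples:
        return []
    baseline = median(s.docs_per_sec for s in samples)
    windows = []
    run_start = None
    for i, s in enumerate(samples):
        hot = (s.disk_util_pct >= util_threshold
               and s.docs_per_sec < slowdown * baseline)
        if hot and run_start is None:
            run_start = i  # a candidate saturation window opens here
        elif not hot and run_start is not None:
            if i - run_start >= min_run:
                windows.append((run_start, i - 1))
            run_start = None
    # Close a window that is still open at the end of the series.
    if run_start is not None and len(samples) - run_start >= min_run:
        windows.append((run_start, len(samples) - 1))
    return windows


if __name__ == "__main__":
    series = ([Sample(1000, 40)] * 5
              + [Sample(300, 99)] * 4
              + [Sample(950, 50)] * 3)
    print(saturated_windows(series))  # [(5, 8)]
```

Requiring a minimum run length keeps short bursts, such as a single flush, from being reported as sustained saturation.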

Recommended action:

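Migrate Solr data directories to SSD storage, which typically improves I/O performance substantially. Raise mergeFactor in solrconfig.xml to reduce merge frequency (this leaves more segments on disk but reduces merge overhead); on Solr versions that no longer accept mergeFactor, the corresponding settings are maxMergeAtOnce and segmentsPerTier on TieredMergePolicyFactory. Batch document updates to reduce the number of I/O operations.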
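Batching updates might look like the following Python sketch, which groups documents into fixed-size chunks and posts each chunk as one JSON array to the collection's update handler. The helper names (`batched`, `post_batch`, `index_all`), the batch size of 500, the 10-second `commitWithin` value, and the URL in the usage note are illustrative assumptions; tune them against the actual cluster.

```python
import json
import urllib.request
from itertools import islice


def batched(docs, size):
    """Yield lists of up to `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def post_batch(update_url, batch, commit_within_ms=10000):
    """POST one batch as a JSON array to a Solr update handler.
    commitWithin lets Solr schedule the commit instead of forcing
    a hard commit per request."""
    url = f"{update_url}?commitWithin={commit_within_ms}"
    req = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_all(docs, update_url, batch_size=500, send=post_batch):
    """Send all documents in batches; return the number of requests made."""
    requests_made = 0
    for batch in batched(docs, batch_size):
        send(update_url, batch)
        requests_made += 1
    return requests_made
```

A call such as `index_all(docs, "http://localhost:8983/solr/mycollection/update")` would send 500 documents per request; the host and the collection name `mycollection` are placeholders. Passing a different `send` callable makes the batching logic easy to exercise without a running Solr instance.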