Elasticsearch Memory Pressure from Unindexed Schema Fields

critical

Resource Contention

Updated Oct 8, 2025

DataHub search performance degrades when the Elasticsearch cluster runs out of heap memory because of unoptimized index mappings or an excessive number of schema fields. High heap usage combined with slow search queries points to exhaustion of the indexing buffer or the field data cache.

Sources

Monitoring DataHub (docs.datahub.com)

DataHub Performance Optimization (support.datahub.com)

Technologies:

DataHub: Symptoms of this issue are visible in DataHub metrics and logs.

Elasticsearch: The root cause of this issue originates in Elasticsearch.

How to detect:

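Monitor jvm_memory_used (Elasticsearch heap) for sustained usage at or above 85% of the configured heap limit. Correlate it with latency spikes in elasticsearch_index operations and with search query timeouts. Check the write and search thread pools for rejections: thread_pool.write.queue_size and thread_pool.search.queue_size set the queue capacity, and a rising rejected count on either pool means that queue is full.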
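The check described above can be sketched in Python. This is a minimal illustration, not DataHub or Elasticsearch tooling: it assumes the input is a dictionary shaped like the JSON returned by the Elasticsearch node stats API (jvm.mem.heap_used_percent and thread_pool.<pool>.rejected per node). The function name, the sample payload, and the node names are hypothetical.

```python
from typing import Any, Dict, List

HEAP_ALERT_PERCENT = 85  # threshold taken from the detection guidance above


def find_pressured_nodes(
    node_stats: Dict[str, Any], heap_threshold: int = HEAP_ALERT_PERCENT
) -> List[Dict[str, Any]]:
    """Return nodes at or above the heap threshold, or with thread pool rejections."""
    findings = []
    for node_id, node in node_stats.get("nodes", {}).items():
        heap_pct = node.get("jvm", {}).get("mem", {}).get("heap_used_percent", 0)
        pools = node.get("thread_pool", {})
        # Only the write and search pools are named in the guidance above.
        rejected = {
            pool: pools.get(pool, {}).get("rejected", 0)
            for pool in ("write", "search")
        }
        if heap_pct >= heap_threshold or any(rejected.values()):
            findings.append(
                {
                    "node": node.get("name", node_id),
                    "heap_used_percent": heap_pct,
                    "rejected": rejected,
                }
            )
    return findings


# Hypothetical sample payload for illustration, not real cluster output.
SAMPLE_STATS = {
    "nodes": {
        "node-a": {
            "name": "es-data-0",
            "jvm": {"mem": {"heap_used_percent": 91}},
            "thread_pool": {
                "write": {"queue": 180, "rejected": 12},
                "search": {"queue": 0, "rejected": 0},
            },
        },
        "node-b": {
            "name": "es-data-1",
            "jvm": {"mem": {"heap_used_percent": 62}},
            "thread_pool": {
                "write": {"queue": 0, "rejected": 0},
                "search": {"queue": 0, "rejected": 0},
            },
        },
    }
}

for finding in find_pressured_nodes(SAMPLE_STATS):
    print(finding)
```

In practice the dictionary would come from a node stats request against the cluster. Because the rejected counters are cumulative, comparing two samples taken a few minutes apart shows whether rejections are still occurring.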

Recommended action:

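Tune Elasticsearch indices.memory.index_buffer_size (default 10% of heap; consider raising it, up to 30%, for write-heavy workloads). Scale Elasticsearch horizontally by adding nodes or replicas. Review schema metadata ingestion to reduce the number of distinct fields (field cardinality). Use SSD-backed storage for better I/O. Set the heap in ES_JAVA_OPTS (-Xms and -Xmx) to about 50% of container memory, and no higher, so the remainder stays available to the filesystem cache. Make sure frequently queried fields are indexed in the mapping, and consider disabling indexing for fields that are never searched.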
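The sizing arithmetic behind these recommendations can be sketched as follows. This is an illustrative calculation only: the helper names are hypothetical, and the 26 GB heap cap is an assumption added here (a commonly cited conservative limit for keeping compressed object pointers enabled), not something stated in the guidance above.

```python
ASSUMED_HEAP_CAP_MB = 26 * 1024  # assumption: conservative compressed-oops limit


def recommend_heap_mb(container_memory_mb: int, cap_mb: int = ASSUMED_HEAP_CAP_MB) -> int:
    """Half of container memory, capped at cap_mb."""
    return min(container_memory_mb // 2, cap_mb)


def es_java_opts(container_memory_mb: int) -> str:
    """Build an ES_JAVA_OPTS value with equal minimum and maximum heap."""
    heap_mb = recommend_heap_mb(container_memory_mb)
    return f"-Xms{heap_mb}m -Xmx{heap_mb}m"


def index_buffer_mb(heap_mb: int, percent: int) -> int:
    """Heap available to the indexing buffer at a given index_buffer_size percent."""
    return heap_mb * percent // 100


container_mb = 8192  # hypothetical 8 GiB container
heap_mb = recommend_heap_mb(container_mb)
print(es_java_opts(container_mb))    # -Xms4096m -Xmx4096m
print(index_buffer_mb(heap_mb, 10))  # 409 MB at the default 10%
print(index_buffer_mb(heap_mb, 30))  # 1228 MB at 30%
```

With a 4 GiB heap, moving index_buffer_size from 10% to 30% roughly triples the memory reserved for indexing, from about 409 MB to about 1228 MB. That memory comes out of the same heap that serves search, so it is worth rechecking heap usage after the change.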