JVM Memory Pressure GC Thrashing

critical

Resource Contention

Updated Feb 23, 2026

DataHub GMS or Frontend is under memory pressure, which causes frequent garbage collection pauses, degrades API response times, and can lead to OutOfMemoryErrors and service unavailability.

Sources

Monitoring DataHub (docs.datahub.com)

Technologies:

DataHub: Symptoms of this issue are visible in DataHub metrics and logs.

How to detect:

Monitor jvm_memory_used as it approaches the heap limit (>85% of max heap), together with rising GC pause frequency and duration (jvm_gc_pause, process.runtime.jvm.gc.duration). Alert when GC pause times exceed thresholds (p95 > 1s), or when heap utilization remains elevated despite frequent GC cycles, which indicates a memory leak or an undersized heap.
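The two alert conditions above can be sketched as a small check over one observation window. This is a minimal illustration, not DataHub's alerting logic: the function names, the input shape (heap samples and GC pause durations already pulled from the metrics backend), and the "frequent GC" cutoff of 10 collections per minute are all assumptions. Only the 85% heap ratio and the 1s p95 pause come from the text.

```python
import math

HEAP_RATIO_THRESHOLD = 0.85    # from the text: >85% of max heap
P95_PAUSE_THRESHOLD_S = 1.0    # from the text: p95 > 1s
FREQUENT_GC_PER_MIN = 10       # assumption: the text gives no number for "frequent"


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def evaluate_window(heap_used, heap_max, gc_pauses_s, window_s):
    """Return the alert reasons that fire for one observation window.

    heap_used   -- heap usage samples in bytes taken during the window
    heap_max    -- max heap in bytes
    gc_pauses_s -- duration in seconds of each GC pause in the window
    window_s    -- window length in seconds
    """
    reasons = []

    # Condition 1: GC pauses are too long.
    if gc_pauses_s and p95(gc_pauses_s) > P95_PAUSE_THRESHOLD_S:
        reasons.append("gc_pause_p95")

    # Condition 2: the heap never drops below the threshold even though
    # the collector runs often (possible leak or undersized heap).
    gc_per_min = len(gc_pauses_s) / (window_s / 60)
    heap_stays_high = bool(heap_used) and min(heap_used) / heap_max > HEAP_RATIO_THRESHOLD
    if heap_stays_high and gc_per_min > FREQUENT_GC_PER_MIN:
        reasons.append("heap_elevated_despite_gc")

    return reasons
```

Using the minimum heap sample rather than the average is a deliberate choice in this sketch: a healthy heap dips well below the threshold after a collection, so a window whose lowest sample is still above 85% suggests the collector is not reclaiming much.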

Recommended action:

Analyze heap dumps to identify memory leaks (growing object counts, retained heap). Review recent changes that may have increased memory usage (larger batch sizes, new features). Scale DataHub pods vertically with increased heap size (JVM -Xmx flag) or horizontally to distribute load. Tune GC settings for lower pause times. Monitor jvm_threads_live and process.runtime.jvm.threads.count to rule out thread leaks consuming memory. Check for large GraphQL queries or bulk API operations causing temporary memory spikes.
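As a lighter first pass before a full heap dump analysis, growing object counts can be spotted by comparing two class histograms taken some time apart, for example from `jmap -histo <pid>`. The sketch below is an illustration under assumptions: it expects rows shaped like `1:  12345  678901  java.lang.String (java.base@17)`, and the helper names `parse_histo` and `top_growth` are hypothetical. It does not compute retained heap, which still requires a heap dump analyzer.

```python
def parse_histo(text):
    """Parse class-histogram text into {class_name: (instances, bytes)}.

    Assumes data rows of the form "<rank>: <instances> <bytes> <class> [module]".
    Header, separator, and total lines do not match and are skipped.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or not parts[0].endswith(":"):
            continue
        if not (parts[1].isdigit() and parts[2].isdigit()):
            continue
        result[parts[3]] = (int(parts[1]), int(parts[2]))
    return result


def top_growth(before_text, after_text, limit=5):
    """Return up to `limit` classes whose byte footprint grew, largest first.

    Each entry is (class_name, instance_delta, byte_delta).
    """
    before = parse_histo(before_text)
    after = parse_histo(after_text)
    growth = []
    for name, (instances, size) in after.items():
        old_instances, old_size = before.get(name, (0, 0))
        if size > old_size:
            growth.append((name, instances - old_instances, size - old_size))
    growth.sort(key=lambda row: row[2], reverse=True)
    return growth[:limit]
```

A class that keeps climbing across several snapshots taken under steady load is a candidate for closer inspection in the heap dump. A single pair of snapshots can only show growth, not prove a leak.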