Aggregation job exceeds executor memory limits, dropping records
Severity: critical | Category: Resource Contention | Updated: Mar 24, 2026
How to detect:
Large aggregation jobs drop records after hitting executor memory limits, causing data loss and incomplete analytical results. This typically surfaces as executor out-of-memory failures, heavy spill activity in executor metrics, and output row counts that fall short of the input counts.
Recommended action:
- Increase executor memory allocation in the Spark configuration (spark.executor.memory).
- Partition large jobs into smaller chunks using date ranges or key-based partitioning.
- Enable spill-to-disk for memory-intensive operations.
- Monitor executor memory usage metrics (for example, in the Executors tab of the Spark UI).
- Use incremental aggregation or pre-aggregation strategies.
- Tune spark.memory.fraction and spark.memory.storageFraction.
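The memory-related settings above can be passed at submit time. A minimal sketch, assuming a spark-submit deployment; the values and the job file name (`aggregation_job.py`) are illustrative placeholders to tune per workload, not recommendations:

```shell
# Illustrative spark-submit invocation; all values are placeholders.
# spark.executor.memory        : executor heap size
# spark.executor.memoryOverhead: off-heap headroom (YARN/Kubernetes)
# spark.memory.fraction        : share of heap for execution + storage (default 0.6)
# spark.memory.storageFraction : storage share protected from eviction (default 0.5)
spark-submit \
  --conf spark.executor.memory=8g \
  --conf spark.executor.memoryOverhead=1g \
  --conf spark.memory.fraction=0.6 \
  --conf spark.memory.storageFraction=0.5 \
  aggregation_job.py
```

Raising spark.memory.fraction trades heap reserved for user data structures against execution/storage memory, so increase it cautiously.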
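The date-range chunking idea can be sketched without a cluster. A minimal plain-Python illustration (the record shape and function names are hypothetical, not part of any Spark API): each window is processed independently and merged into running totals, the same pattern as running one Spark job per date range and combining partial aggregates.

```python
from datetime import date, timedelta

def date_chunks(start, end, days):
    """Yield (chunk_start, chunk_end) windows covering [start, end)."""
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=days), end)
        yield cur, nxt
        cur = nxt

def aggregate_in_chunks(records, start, end, days=7):
    """Incrementally sum per-key amounts one date window at a time.

    records: iterable of (day: date, key: str, amount: int) tuples.
    Processing one window at a time bounds the working set, mirroring
    how splitting a Spark aggregation by date range keeps each job
    within executor memory limits.
    """
    totals = {}
    for lo, hi in date_chunks(start, end, days):
        # Only this window's records are held and aggregated at once.
        for day, key, amount in records:
            if lo <= day < hi:
                totals[key] = totals.get(key, 0) + amount
    return totals

records = [
    (date(2026, 3, 1), "a", 1),
    (date(2026, 3, 5), "b", 2),
    (date(2026, 3, 12), "a", 3),
]
print(aggregate_in_chunks(records, date(2026, 3, 1), date(2026, 3, 15)))
```

Because per-window totals merge by simple addition, the same structure supports incremental aggregation: persist the partial totals and fold in each new date range as it arrives.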