Excessive Disk I/O from Shuffle Spill

Resource Contention

When a Spark executor runs out of memory during a shuffle operation, it spills intermediate data to disk, which slows jobs dramatically. A high spark_executor_diskused value during shuffle-heavy stages indicates memory-to-disk spill, and spilled processing can be 10-100x slower than in-memory processing.
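As a rough illustration of how spill might be confirmed per stage, the sketch below ranks stages by the bytes they spilled to disk and estimates a shuffle partition count that keeps each partition small enough to fit in memory. It assumes stage records shaped like those returned by the Spark monitoring REST API (with `stageId` and `diskBytesSpilled` fields). The helper names, the sample data, and the 128 MiB per-partition target are hypothetical choices for this example, not part of the insight above.

```python
import math

# Assumption: a common rule-of-thumb target size per shuffle partition.
DEFAULT_TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def find_spilling_stages(stages, min_spill_bytes=0):
    """Return (stageId, diskBytesSpilled) pairs for stages that spilled, largest first."""
    spilled = [
        (stage["stageId"], stage.get("diskBytesSpilled", 0))
        for stage in stages
        if stage.get("diskBytesSpilled", 0) > min_spill_bytes
    ]
    return sorted(spilled, key=lambda pair: pair[1], reverse=True)


def suggest_shuffle_partitions(shuffle_bytes,
                               target_partition_bytes=DEFAULT_TARGET_PARTITION_BYTES):
    """Estimate a partition count so each shuffle partition is near the target size."""
    return max(1, math.ceil(shuffle_bytes / target_partition_bytes))


# Hypothetical stage records, for illustration only.
sample_stages = [
    {"stageId": 1, "diskBytesSpilled": 1024},
    {"stageId": 2, "diskBytesSpilled": 0},
    {"stageId": 3, "diskBytesSpilled": 5_000_000_000},
]

spilling = find_spilling_stages(sample_stages)
partitions = suggest_shuffle_partitions(10 * 1024**3)  # 10 GiB of shuffle data
```

The resulting estimate could be used as a starting value for `spark.sql.shuffle.partitions`; whether more partitions or more executor memory is the better fix depends on the workload.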
