Excessive Disk I/O from Shuffle Spill
Resource Contention
When Spark executors run out of memory during shuffle operations, intermediate data spills to disk and jobs slow down dramatically. A high spark_executor_diskused value during shuffle-heavy stages indicates memory-to-disk spill, which can be 10-100x slower than in-memory processing.
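One way to confirm which stages are spilling is to inspect per-stage metrics. The sketch below is illustrative only: it assumes stage records have already been fetched as dictionaries whose keys mirror the stage fields of Spark's monitoring REST API (stageId, diskBytesSpilled, shuffleWriteBytes). The function name find_spilling_stages and the min_spill_bytes threshold are hypothetical, not part of any Spark or Databricks API.

```python
from typing import Dict, List, Optional


def find_spilling_stages(stages: List[Dict], min_spill_bytes: int = 0) -> List[Dict]:
    """Return stages whose disk spill exceeds min_spill_bytes, largest first.

    Assumption: each stage dict uses field names modeled on Spark's
    monitoring REST API stage data. Missing fields are treated as 0.
    """
    flagged = []
    for stage in stages:
        disk_spill = stage.get("diskBytesSpilled", 0)
        shuffle_write = stage.get("shuffleWriteBytes", 0)
        if disk_spill > min_spill_bytes:
            # Ratio of spilled bytes to shuffle-write bytes; None when the
            # stage wrote no shuffle data, to avoid dividing by zero.
            ratio: Optional[float] = (
                disk_spill / shuffle_write if shuffle_write else None
            )
            flagged.append(
                {
                    "stageId": stage.get("stageId"),
                    "diskBytesSpilled": disk_spill,
                    "spillToShuffleWriteRatio": ratio,
                }
            )
    # Largest spill first, so the worst offenders are reviewed first.
    return sorted(flagged, key=lambda s: s["diskBytesSpilled"], reverse=True)
```

Stages that appear near the top of the result during shuffle-heavy work are the natural candidates to check against spikes in spark_executor_diskused.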