Executor Memory Pressure from Oversized Partitions
Severity: critical. Spark executors fail with OOM errors when processing partitions significantly larger than the recommended 200-500MB range, exhausting executor heap memory and causing cascading failures across the cluster.
Monitor partition size distributions and executor memory utilization. Fire when spark_executor_memory_used approaches spark_executor_max_memory while spark_stage_disk_size_spilled is increasing, spark_executor_completed_tasks is decreasing, and spark_stage_count_failed_tasks is rising correspondingly.
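The detection condition above can be sketched as a simple predicate over scraped metric samples. This is an illustrative helper, not part of any Spark or monitoring API: the function name, the 90% memory threshold, and the delta-based inputs are assumptions; in practice the deltas would come from rate/derivative queries against your metrics backend.

```python
def should_alert(mem_used_bytes: float,
                 mem_max_bytes: float,
                 spill_delta_bytes: float,
                 completed_tasks_delta: float,
                 failed_tasks_delta: float,
                 mem_threshold: float = 0.9) -> bool:
    """Hypothetical alert predicate mirroring the detection rule:
    memory near capacity AND spill growing AND task completion
    falling AND task failures rising. All deltas are changes over
    the evaluation window (assumed, not a Spark-defined metric)."""
    return (mem_used_bytes >= mem_threshold * mem_max_bytes
            and spill_delta_bytes > 0
            and completed_tasks_delta < 0
            and failed_tasks_delta > 0)

# Executor at 95% of a 10 GiB heap, spill growing, completions
# dropping, failures climbing -> alert fires.
print(should_alert(9.5 * 2**30, 10 * 2**30, 512 * 2**20, -40, 12))
```

Requiring all four signals together reduces false positives: high memory use alone is normal under load, but combined with disk spill growth and rising task failures it points at oversized partitions.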
Repartition data to target 200-500MB partitions before memory-intensive operations. Calculate the partition count by dividing the total dataset size by the target partition size and rounding up. Apply repartition early in the pipeline, and for streaming workloads cap batch sizes (for example via maxRecordsPerBatch) to bound partition growth.
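The partition-count calculation can be sketched as follows. This is a minimal illustration, assuming a 256MB target (within the 200-500MB range); the function name is hypothetical, while `df.repartition(n)` is the standard PySpark/Spark call referenced in the remediation.

```python
import math

def optimal_partition_count(total_size_bytes: int,
                            target_partition_bytes: int = 256 * 1024**2) -> int:
    """Divide total dataset size by the target partition size,
    rounding up so no partition exceeds the target."""
    return max(1, math.ceil(total_size_bytes / target_partition_bytes))

# A 100 GiB dataset at a 256 MiB target needs 400 partitions.
n = optimal_partition_count(100 * 1024**3)
print(n)  # -> 400

# In a Spark job this count would then be applied before the
# memory-intensive stage, e.g.:
#   df = df.repartition(n)
```

Rounding up rather than down is deliberate: truncating would push each partition slightly above the target, which is exactly the condition this alert guards against.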