Apache DataFusion, Apache Spark

Distributed system logs not collected from all workers

warning
availability
Updated Feb 22, 2024 (via Exa)
How to detect:

In distributed systems such as Spark, logs from executors and worker nodes are not collected centrally. When only driver-side logs are available, it is difficult to diagnose failures that occur on a specific executor or to reconstruct the full execution flow.
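One way to check for this condition is to compare the nodes known to be in the cluster against the nodes that actually appear in the central log store. The following is a minimal sketch, assuming both lists are already available as host names; the function name and the sample host names are hypothetical, and how the lists are obtained depends on the cluster manager and log backend in use.

```python
from typing import Iterable, Set


def missing_log_sources(expected_hosts: Iterable[str],
                        collected_hosts: Iterable[str]) -> Set[str]:
    """Return hosts that belong to the cluster but have no entries in the central log store."""
    return set(expected_hosts) - set(collected_hosts)


# Hypothetical inventory: every node that ran part of the job.
expected = ["driver-0", "executor-1", "executor-2", "executor-3"]
# Hypothetical result of querying the log store for distinct source hosts.
collected = ["driver-0", "executor-1"]

missing = sorted(missing_log_sources(expected, collected))
print(missing)
```

A non-empty result suggests that some workers or executors are not shipping their logs.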

Recommended action:

Configure centralized log collection for distributed systems. For Spark, provide a Log4j 2 configuration (the logging framework Spark uses as of version 3.3) that applies to every worker and executor as well as the driver. Verify that the log aggregation system collects from all nodes in the cluster, not just the driver/master node.
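As an illustration of shipping a logging configuration to executors, the sketch below builds `spark-submit` arguments. It assumes a YARN-style deployment in which files passed with `--files` land in each executor's working directory, so the executor JVM can refer to the file by its base name. The helper name, the `conf/log4j2.properties` path, and `app.py` are hypothetical. The contents of the properties file, including any appender that forwards to a central store, are outside this sketch.

```python
import os
from typing import List


def executor_logging_args(log4j2_path: str) -> List[str]:
    """Build spark-submit arguments that distribute a Log4j 2 config to executors."""
    # Executors see the shipped file under its base name in their working directory
    # (assumption: YARN-style file distribution via --files).
    file_name = os.path.basename(log4j2_path)
    return [
        "--files", log4j2_path,
        "--conf",
        f"spark.executor.extraJavaOptions=-Dlog4j.configurationFile={file_name}",
    ]


args = executor_logging_args("conf/log4j2.properties")
# Print the full command line; app.py is a placeholder application.
print(" ".join(["spark-submit", *args, "app.py"]))
```

The driver can be pointed at the same file through `spark.driver.extraJavaOptions`, though the correct path depends on the deploy mode.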