Aggregation operations under-partition, causing up to 1.95x slower queries in benchmarks
warning | performance | Updated Mar 10, 2026 (via Exa)
Technologies:
How to detect:
Aggregation operations default to Final mode (a single-partition final aggregation) in cases where FinalPartitioned mode would perform better. When forced to partitioned mode, TPC-DS query 4 runs 1.95x faster and TPC-H query 3 runs 1.10x faster. The gap becomes significant once an aggregation accumulates more than 200k groups.
Recommended action:
Monitor the datafusion.aggregate.groups metric. When the group count exceeds 200k, review the optimizer settings and consider forcing partitioned aggregation. Examine query plans for Final mode aggregations on high-cardinality group-by operations. Dynamic switching between the two modes has been proposed but is not yet implemented.
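The plan review described above can be partly automated. The following is a minimal sketch, not an official tool: it assumes the physical plan text (for example, the output of EXPLAIN) prints aggregate nodes as "AggregateExec: mode=<Mode>, gby=[...], aggr=[...]", and that the observed group count has already been read from the metric named above. The function names, the sample plan, and the way the count is supplied are hypothetical and chosen only for illustration.

```python
import re

# Threshold taken from the guidance above; tune it for your workload.
GROUP_COUNT_THRESHOLD = 200_000

# Assumed plan line shape: "AggregateExec: mode=Final, gby=[...], aggr=[...]"
_AGG_LINE = re.compile(r"AggregateExec:\s*mode=(\w+),\s*gby=\[(.*?)\],\s*aggr=")


def find_single_partition_aggregates(plan_text):
    """Return (line_number, group_by_exprs) for each Final-mode aggregate that has a GROUP BY."""
    hits = []
    for lineno, line in enumerate(plan_text.splitlines(), start=1):
        match = _AGG_LINE.search(line)
        if match is None:
            continue
        mode, group_by = match.group(1), match.group(2).strip()
        # mode=Final with an empty gby is a scalar aggregate, which has one group; skip it.
        if mode == "Final" and group_by:
            hits.append((lineno, group_by))
    return hits


def should_review(plan_text, observed_groups):
    """True when the group count is above the threshold and the plan has a Final-mode GROUP BY."""
    if observed_groups <= GROUP_COUNT_THRESHOLD:
        return False
    return bool(find_single_partition_aggregates(plan_text))


# Hypothetical plan text, used only to exercise the helpers.
SAMPLE_PLAN = """\
AggregateExec: mode=Final, gby=[c_customer_id@0 as c_customer_id], aggr=[sum(ss_net_paid)]
  CoalescePartitionsExec
    AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id], aggr=[sum(ss_net_paid)]
"""

if __name__ == "__main__":
    for lineno, group_by in find_single_partition_aggregates(SAMPLE_PLAN):
        print(f"plan line {lineno}: Final-mode aggregate grouped by {group_by}")
    print("review needed:", should_review(SAMPLE_PLAN, observed_groups=250_000))
```

The check flags only candidates for review. Whether partitioned aggregation actually helps still depends on the query and should be confirmed by measuring.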