Low cardinality aggregates benefit from partial/final mode while high cardinality suffers
infoPartial+Final aggregate mode performs well for low cardinality (e.g., 4 distinct groups across 2M rows) because hash tables remain small and final shuffle is minimal. However, this mode degrades for high cardinality (millions of groups) due to duplicate hashing and row conversions. Single mode shows opposite behavior: slight regression for low cardinality but 30x improvement for high cardinality.
Implement adaptive aggregate mode selection based on runtime cardinality statistics. For low cardinality (<1000 groups), use Partial+Final for parallelism. For high cardinality (>100K groups), use SinglePartitioned or skip partial aggregation. Monitor datafusion.aggregate.groups during execution to make dynamic decisions. Consider DuckDB's approach of consolidating aggregate and repartition operators for adaptive algorithm selection.