Apache DataFusion

Hash join with empty build side reports zero output rows despite producing data

warning
performance
Updated Mar 8, 2026 (via Exa)
How to detect:

When a hash join has an empty build side, the fast-path optimization reports 0 output rows in its metrics even when the join actually produces rows. This happens because the metrics tracking code on that path was not updated after a change to how metrics are handled. The issue specifically affects LEFT ANTI JOIN and RIGHT ANTI JOIN operations whose build side is empty after filtering.

Recommended action:

When diagnosing hash join performance or validating query execution, verify output row counts against the actual result set rather than relying solely on HashJoinExec metrics. Check EXPLAIN ANALYZE output for discrepancies between the reported output_rows and the number of rows the query actually returns. For monitoring dashboards, add independent validation of row count metrics for hash join operations, especially anti-joins with filtered build sides. The issue is fixed in PR #20810.
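
The check described above can be sketched as a small text scan over EXPLAIN ANALYZE output. This is a minimal illustration, not DataFusion tooling: the function name `suspicious_hash_joins`, the regular expression, and the sample plan text are all assumptions made for this example, and the exact layout of the plan lines may differ between DataFusion versions. It is also a coarse heuristic, since a join deep inside a larger plan can legitimately emit 0 rows while the query as a whole returns data.

```python
import re

# Assumed shape of a HashJoinExec line in EXPLAIN ANALYZE text:
#   HashJoinExec: ..., join_type=<Type>, ..., metrics=[output_rows=<N>, ...]
# The real layout may vary by DataFusion version; adjust the pattern as needed.
HASH_JOIN_LINE = re.compile(
    r"HashJoinExec:.*?join_type=(?P<join_type>\w+).*?output_rows=(?P<rows>\d+)"
)


def suspicious_hash_joins(plan_text: str, actual_rows: int) -> list[str]:
    """Return the join types of HashJoinExec nodes that report 0 output rows
    even though the query actually returned rows (hypothetical helper)."""
    flagged = []
    for line in plan_text.splitlines():
        match = HASH_JOIN_LINE.search(line)
        if match and int(match.group("rows")) == 0 and actual_rows > 0:
            flagged.append(match.group("join_type"))
    return flagged


# Hand-written sample for illustration only, not captured from a real run.
sample_plan = """\
CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=3, elapsed_compute=12ms]
  HashJoinExec: mode=CollectLeft, join_type=RightAnti, on=[(id@0, id@0)], metrics=[output_rows=0, elapsed_compute=40ms]
"""

# actual_rows would come from counting the rows of the real query result.
print(suspicious_hash_joins(sample_plan, actual_rows=3))  # ['RightAnti']
```

A flagged join type is only a prompt to compare the plan against the real result set; it does not by itself confirm that the query is affected by this issue.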