datafusion.join.build_input_rows
Build side input rows
Dimensions: None
Available on: Native (1)
Interface Metrics (1)
Sources
Technical Annotations (40)
Configuration Parameters (4)
DATAFUSION_OPTIMIZER_REPARTITION_JOINS (recommended: true)
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD (recommended: 0)
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD_ROWS (recommended: 0)
collect_left_threshold
Error Signatures (1)
output_rows=0 (log pattern)
CLI Commands (4)
diagnostic: EXPLAIN ANALYZE SELECT * FROM t1 LEFT ANTI JOIN (SELECT * FROM t2 WHERE k <> 1) t2 ON t1.k = t2.k
diagnostic: SELECT * FROM lineitem, orders WHERE l_orderkey = o_orderkey AND o_orderkey = 1 AND l_quantity < (SELECT avg(l_quantity) FROM lineitem WHERE l_orderkey = o_orderkey);
diagnostic: cargo run --profile release-nonlto --bin dfbench tpcds --query 99 --iterations 3 --path benchmarks/data/tpcds_sf1 --query_path datafusion/core/tests/tpc-ds --prefer_hash_join true
diagnostic: datafusion-cli -c "select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' and l_quantity < (select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey);"
Technical References (31)
batch_size (configuration parameter)
CoalesceBatchesExec (component)
BatchSplitter (component)
BatchCoalescer (component)
SortMergeJoin (component)
HashJoin (component)
HashJoinExec (component)
LEFT ANTI JOIN (concept)
RIGHT ANTI JOIN (concept)
fast-path optimization (concept)
join parameterization (concept)
predicate pushdown (concept)
join ordering (concept)
cardinality estimation (concept)
TPC-H (concept)
build side (concept)
probe side (concept)
ColumnStatistics (component)
Table Statistics (component)
join cardinality estimation (concept)
filter selectivity (concept)
CollectLeft (component)
Partitioned (component)
TPC-H query 7 (concept)
TPC-H query 21 (concept)
TPC-H query 4 (concept)
star schema (concept)
right deep tree (concept)
EXPLAIN ANALYZE (component)
RecordBatch (component)
equi join columns (concept)
Related Insights (10)
Join operators produce non-uniform batch sizes causing memory and performance issues (warning)
Hash join with empty build side reports zero output rows despite producing data (warning)
Join parameterization missing causes full table scans on selective queries (critical)
Suboptimal join ordering on nested TPC-H queries (warning)
Hash join input order swap can cause 1000x performance degradation (critical)
Join disasters can occur without proper join reordering and statistics (critical)
Hash join optimizer selects non-partitioned mode causing 52x slower query execution (critical)
Suboptimal hash join build-side selection causes performance degradation (warning)
Suboptimal join order causes 60% query performance degradation on multi-table joins (warning)
Hash join build side must use smaller input to avoid memory pressure and slow probe phase (warning)
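To illustrate why the insights above tie build-side selection to this metric, here is a minimal toy model of an inner hash join. It is a sketch only and not DataFusion's HashJoinExec: the function hash_join and the in-memory orders and lineitem inputs are hypothetical, and the metric names are borrowed from this page purely as labels.

```python
from collections import defaultdict


def hash_join(build, probe, key):
    """Toy inner hash join; returns (output_rows, metrics)."""
    table = defaultdict(list)
    # Build phase: every build-side row is hashed and held in memory.
    for row in build:
        table[row[key]].append(row)
    out = []
    # Probe phase: stream the other input and look up matches.
    for row in probe:
        for match in table.get(row[key], ()):
            out.append({**match, **row})
    metrics = {"build_input_rows": len(build), "output_rows": len(out)}
    return out, metrics


# Hypothetical inputs: one small table, one large table.
orders = [{"k": i} for i in range(10)]
lineitem = [{"k": i % 10} for i in range(1000)]

_, small_build = hash_join(orders, lineitem, "k")   # small input as build side
_, large_build = hash_join(lineitem, orders, "k")   # inputs swapped

print(small_build)
print(large_build)
```

Both orderings produce the same number of output rows, but the swapped ordering holds 100 times as many rows in the hash table. In this toy model that difference shows up directly in build_input_rows, which is the kind of signal the build-side insights point at.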