Apache DataFusion metric

datafusion.join.build_input_rows

Build side input rows
Dimensions: None
Available on: Native (1)
Interface Metrics (1)
Native
Number of rows received from the build side of a join
Dimensions: None

Technical Annotations (40)

Configuration Parameters (4)
DATAFUSION_OPTIMIZER_REPARTITION_JOINS (recommended: true)
Forces optimizer to consider partitioned joins
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD (recommended: 0)
Disables single partition threshold to force partitioned joins
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD_ROWS (recommended: 0)
Disables row count threshold to force partitioned joins
collect_left_threshold
Memory limit threshold to determine if both join sides can fit in memory for dynamic reordering
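The three uppercase parameters above are written as environment variables. A minimal sketch of applying the recommended values, assuming datafusion-cli reads `DATAFUSION_*` variables from its environment at startup (check this against your DataFusion version) and that forcing partitioned hash joins is what you want; these are diagnostic settings, not general-purpose defaults:

```shell
# Recommended values from the list above; they push the optimizer toward
# partitioned hash joins instead of single-partition mode.
export DATAFUSION_OPTIMIZER_REPARTITION_JOINS=true
export DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD=0
export DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD_ROWS=0

# Print what a child process such as datafusion-cli would inherit.
env | grep '^DATAFUSION_OPTIMIZER_' | sort
```

Any datafusion-cli process started from the same shell afterwards inherits these values. collect_left_threshold is left out on purpose, since it is not listed as an environment variable.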
Error Signatures (1)
[log pattern] output_rows=0
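One way to look for this signature is to save the EXPLAIN ANALYZE output to a file and search the join operator lines, reading datafusion.join.build_input_rows (shown as `build_input_rows` in the plan) alongside it. A minimal sketch follows; the plan text is an illustrative sample rather than real output, and the exact metric layout may differ between DataFusion versions:

```shell
# Illustrative plan lines only (an assumption, not captured output); real
# EXPLAIN ANALYZE output carries more metrics per operator.
plan_file=$(mktemp)
cat > "$plan_file" <<'EOF'
HashJoinExec: mode=CollectLeft, join_type=RightAnti, on=[(k@0, k@0)], metrics=[output_rows=0, build_input_rows=0, input_rows=1000]
ProjectionExec: expr=[k@0 as k], metrics=[output_rows=1000]
EOF

# Hash join lines that report the output_rows=0 signature.
# -w keeps values such as output_rows=01 from matching.
suspect_lines=$(grep 'HashJoinExec' "$plan_file" | grep -w 'output_rows=0')

# Pull out the build-side row count reported on those lines.
build_rows=$(printf '%s\n' "$suspect_lines" | grep -o 'build_input_rows=[0-9]*' | cut -d= -f2)

echo "suspect join lines: $(printf '%s\n' "$suspect_lines" | grep -c .)"
echo "build_input_rows: $build_rows"
rm -f "$plan_file"
```

A join line that shows `output_rows=0` together with `build_input_rows=0` is a candidate for the empty-build-side case listed under Related Insights, so it is worth comparing against the rows the query actually returns.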
CLI Commands (4)
[diagnostic] EXPLAIN ANALYZE SELECT * FROM t1 LEFT ANTI JOIN (SELECT * FROM t2 WHERE k <> 1) t2 ON t1.k = t2.k
[diagnostic] SELECT * FROM lineitem, orders WHERE l_orderkey = o_orderkey AND o_orderkey = 1 AND l_quantity < (SELECT avg(l_quantity) FROM lineitem WHERE l_orderkey = o_orderkey);
[diagnostic] cargo run --profile release-nonlto --bin dfbench tpcds --query 99 --iterations 3 --path benchmarks/data/tpcds_sf1 --query_path datafusion/core/tests/tpc-ds --prefer_hash_join true
[diagnostic] datafusion-cli -c "select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' and l_quantity < (select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey);"
Technical References (31)
batch_size (configuration parameter)
CoalesceBatchesExec (component)
BatchSplitter (component)
BatchCoalescer (component)
SortMergeJoin (component)
HashJoin (component)
HashJoinExec (component)
LEFT ANTI JOIN (concept)
RIGHT ANTI JOIN (concept)
fast-path optimization (concept)
join parameterization (concept)
predicate pushdown (concept)
join ordering (concept)
cardinality estimation (concept)
TPC-H (concept)
build side (concept)
probe side (concept)
ColumnStatistics (component)
Table Statistics (component)
join cardinality estimation (concept)
filter selectivity (concept)
CollectLeft (component)
Partitioned (component)
TPC-H query 7 (concept)
TPC-H query 21 (concept)
TPC-H query 4 (concept)
star schema (concept)
right deep tree (concept)
EXPLAIN ANALYZE (component)
RecordBatch (component)
equi join columns (concept)
Related Insights (10)
[warning] Join operators produce non-uniform batch sizes causing memory and performance issues
[warning] Hash join with empty build side reports zero output rows despite producing data
[critical] Join parameterization missing causes full table scans on selective queries
[warning] Suboptimal join ordering on nested TPC-H queries
[critical] Hash join input order swap can cause 1000x performance degradation
[critical] Join disasters can occur without proper join reordering and statistics
[critical] Hash join optimizer selects non-partitioned mode causing 52x slower query execution
[warning] Suboptimal hash join build-side selection causes performance degradation
[warning] Suboptimal join order causes 60% query performance degradation on multi-table joins
[warning] Hash join build side must use smaller input to avoid memory pressure and slow probe phase