datafusion.join.total_time
Total join operation time
Dimensions: None
Available on: Native (1)
Interface Metrics (1)
Sources
Technical Annotations (42)
Configuration Parameters (7)
datafusion.execution.hash_join_buffering_capacity (recommended: 0)
/sys/kernel/mm/transparent_hugepage/enabled (recommended: never)
allow_symmetric_joins_without_pruning
repartition_joins
DATAFUSION_OPTIMIZER_REPARTITION_JOINS (recommended: true)
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD (recommended: 0)
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD_ROWS (recommended: 0)
CLI Commands (1)
cargo run --profile release-nonlto --bin dfbench tpcds --query 99 --iterations 3 --path benchmarks/data/tpcds_sf1 --query_path datafusion/core/tests/tpc-ds --prefer_hash_join true (diagnostic)
Technical References (34)
CoalesceBatchesStream (component)
HashJoin (component)
MutableArrayData (component)
arrow_select::concat (component)
NestedLoopJoin (component)
selectivity (concept)
IMDB benchmark (concept)
CoalesceBatchesExec (component)
FilterExec (component)
target_partitions (configuration parameter)
RepartitionExec (component)
HashJoinExec (component)
radix tree (concept)
build side (concept)
probe side (concept)
apply_join_filter_to_indices (component)
build_batch_from_indices (component)
NestedLoopJoinExec (component)
UnionArray (component)
build_row_join_batch (component)
ScalarValue::to_array_of_size (component)
hash join (component)
probe operation (concept)
next vector (component)
chain lookup (concept)
ColumnStatistics (component)
Table Statistics (component)
join cardinality estimation (concept)
filter selectivity (concept)
TPC-H (concept)
symmetric join (concept)
partition pruning (concept)
CollectLeft (component)
Partitioned (component)
Related Insights (13)
Hash join buffering disabled causes probe side to wait for complete build materialization (info)
CoalesceBatches spends 17% of join execution time concatenating small filtered batches (warning)
Batch splitting in joins may cause performance regression (warning)
Nested loop join batch size fix causes performance regression on certain query patterns (warning)
CoalesceBatchesExec placement after joins misses optimization opportunities (info)
Join operations 4-10x slower than competitor on 1e8 row datasets (warning)
Hash join input order swap can cause 1000x performance degradation (critical)
NestedLoopJoin filter evaluation creates oversized intermediate batches (warning)
Nested Loop Join performance degrades 45x with Union Array types in DataFusion 50 (critical)
Join performance degrades when build side contains non-unique values (info)
Join disasters can occur without proper join reordering and statistics (critical)
Symmetric joins without pruning can cause performance degradation (warning)
Hash join optimizer selects non-partitioned mode causing 52x slower query execution (critical)
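
The configuration parameters listed on this page mix dotted option names with DATAFUSION_* environment variables. The sketch below shows how the two forms are assumed to relate: it assumes DataFusion derives the environment-variable name by upper-casing the dotted key and replacing dots with underscores. The helper name config_key_to_env_var and the three dotted keys are illustrative assumptions, not taken from this page.

```python
def config_key_to_env_var(key: str) -> str:
    """Upper-case a dotted option name and replace dots with underscores.

    Assumption: this mirrors how DataFusion maps option names to
    environment variables; verify against the version in use.
    """
    return key.upper().replace(".", "_")


# Hypothetical dotted keys, chosen to match the DATAFUSION_* names on this page.
ASSUMED_KEYS = [
    "datafusion.optimizer.repartition_joins",
    "datafusion.optimizer.hash_join_single_partition_threshold",
    "datafusion.optimizer.hash_join_single_partition_threshold_rows",
]

if __name__ == "__main__":
    for key in ASSUMED_KEYS:
        print(f"{key} -> {config_key_to_env_var(key)}")
```

Under that assumption, the unqualified repartition_joins entry and DATAFUSION_OPTIMIZER_REPARTITION_JOINS would refer to the same optimizer setting.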