
datafusion.join.total_time

Total join operation time
Dimensions: None
Available on: Native (1)
Interface Metrics (1)
Native
Time spent performing join operations
Dimensions: None

Technical Annotations (42)

Configuration Parameters (7)
datafusion.execution.hash_join_buffering_capacity (recommended: 0)
Disabled by default (0). Setting it above 0 can help I/O-bound queries but may degrade dynamic filter performance.
/sys/kernel/mm/transparent_hugepage/enabled (recommended: never)
Transparent huge pages cause ~18% overhead via page faults in database workloads.
allow_symmetric_joins_without_pruning
Controls whether symmetric joins are permitted without partition pruning.
repartition_joins
Enables join repartitioning for distributed execution.
DATAFUSION_OPTIMIZER_REPARTITION_JOINS (recommended: true)
Forces the optimizer to consider partitioned joins.
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD (recommended: 0)
Disables the single-partition threshold to force partitioned joins.
DATAFUSION_OPTIMIZER_HASH_JOIN_SINGLE_PARTITION_THRESHOLD_ROWS (recommended: 0)
Disables the row-count threshold to force partitioned joins.
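The three DATAFUSION_OPTIMIZER_* entries above read like environment-variable spellings of dotted configuration keys. The following is a minimal sketch, assuming the usual DataFusion convention (upper-case the key and replace each dot with an underscore) and assuming these variables correspond to keys under datafusion.optimizer. The helper names `env_var_for` and `export_lines` are hypothetical and exist only for illustration.

```python
# Sketch: derive environment-variable names from dotted config keys.
# Assumption: the variable name is the key upper-cased with "." -> "_".

# Recommended values taken from the parameter list above; the dotted key
# spellings are an assumption inferred from the variable names.
RECOMMENDED = {
    "datafusion.optimizer.repartition_joins": "true",
    "datafusion.optimizer.hash_join_single_partition_threshold": "0",
    "datafusion.optimizer.hash_join_single_partition_threshold_rows": "0",
}


def env_var_for(config_key: str) -> str:
    """Return the environment-variable spelling of a dotted config key."""
    return config_key.upper().replace(".", "_")


def export_lines(settings: dict) -> list:
    """Render shell `export` lines for a mapping of config key -> value."""
    return [f"export {env_var_for(key)}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    for line in export_lines(RECOMMENDED):
        print(line)
```

The mapping only runs one way: an underscore in a variable name could have come from either a dot or an underscore in the key, so the variable name alone does not recover the dotted key.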
CLI Commands (1)
cargo run --profile release-nonlto --bin dfbench tpcds --query 99 --iterations 3 --path benchmarks/data/tpcds_sf1 --query_path datafusion/core/tests/tpc-ds --prefer_hash_join true
(diagnostic)
Technical References (34)
CoalesceBatchesStream (component)
HashJoin (component)
MutableArrayData (component)
arrow_select::concat (component)
NestedLoopJoin (component)
selectivity (concept)
IMDB benchmark (concept)
CoalesceBatchesExec (component)
FilterExec (component)
target_partitions (configuration parameter)
RepartitionExec (component)
HashJoinExec (component)
radix tree (concept)
build side (concept)
probe side (concept)
apply_join_filter_to_indices (component)
build_batch_from_indices (component)
NestedLoopJoinExec (component)
UnionArray (component)
build_row_join_batch (component)
ScalarValue::to_array_of_size (component)
hash join (component)
probe operation (concept)
next vector (component)
chain lookup (concept)
ColumnStatistics (component)
Table Statistics (component)
join cardinality estimation (concept)
filter selectivity (concept)
TPC-H (concept)
symmetric join (concept)
partition pruning (concept)
CollectLeft (component)
Partitioned (component)
Related Insights (13)
Hash join buffering disabled causes probe side to wait for complete build materialization (info)
CoalesceBatches spends 17% of join execution time concatenating small filtered batches (warning)
Batch splitting in joins may cause performance regression (warning)
Nested loop join batch size fix causes performance regression on certain query patterns (warning)
CoalesceBatchesExec placement after joins misses optimization opportunities (info)
Join operations 4-10x slower than competitor on 1e8 row datasets (warning)
Hash join input order swap can cause 1000x performance degradation (critical)
NestedLoopJoin filter evaluation creates oversized intermediate batches (warning)
Nested Loop Join performance degrades 45x with Union Array types in DataFusion 50 (critical)
Join performance degrades when build side contains non-unique values (info)
Join disasters can occur without proper join reordering and statistics (critical)
Symmetric joins without pruning can cause performance degradation (warning)
Hash join optimizer selects non-partitioned mode causing 52x slower query execution (critical)
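Several of the insights above turn on which input becomes the build side of a hash join and on how many duplicate keys it holds. The toy model below is a sketch for intuition only and is not DataFusion's HashJoinExec. The function `hash_join`, the row layout, and the reported counters are all hypothetical. It materializes the build side into a hash table, streams the probe side through it, and reports how many rows were held in memory and how long the longest per-key chain was.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Toy inner hash join over lists of dicts; returns (rows, stats)."""
    # Build phase: the whole build side is materialized before probing starts.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: each probe row walks the chain of build rows sharing its key.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], ()):
            joined.append({**match, **row})

    stats = {
        "rows_held": sum(len(chain) for chain in table.values()),
        "longest_chain": max((len(chain) for chain in table.values()), default=0),
    }
    return joined, stats


# Hypothetical data: a 3-row table with unique ids, and a 1000-row table
# whose foreign key repeats each id many times.
small = [{"id": i, "name": f"n{i}"} for i in range(3)]
large = [{"fk": i % 3, "val": i} for i in range(1000)]

rows_a, stats_small_build = hash_join(small, large, "id", "fk")
rows_b, stats_large_build = hash_join(large, small, "fk", "id")

if __name__ == "__main__":
    print(len(rows_a), stats_small_build)
    print(len(rows_b), stats_large_build)
```

Both orders produce the same number of joined rows, but building on the large input holds every one of its rows in the hash table and creates long chains of duplicate keys. That is the general shape of the input-order and non-unique build side problems listed above, though the real magnitudes depend on the engine and the data.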