Apache DataFusion metric

datafusion.operator.elapsed_time

Operator execution time
Dimensions: None
Available on: Prometheus (1), Native (1), OpenTelemetry (1)
Interface Metrics (3)
Prometheus
CPU time spent computing in a physical plan operator
Dimensions: None
Native
Time spent in actual computation excluding IO waits
Dimensions: None
OpenTelemetry
Cumulative time spent executing a specific physical plan operator
Dimensions: None
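
The descriptions above share one idea: the metric accumulates only the time an operator spends computing, not the time it spends waiting for input. The sketch below illustrates that idea in plain Python. It is not DataFusion's implementation; `ElapsedCompute`, `run_operator`, and the `wait` callback are hypothetical names introduced for illustration only.

```python
import time
from contextlib import contextmanager


class ElapsedCompute:
    """Hypothetical counter that accumulates nanoseconds spent in timed sections."""

    def __init__(self, clock=time.perf_counter_ns):
        self._clock = clock  # injectable so the sketch can be tested deterministically
        self.total_ns = 0

    @contextmanager
    def timer(self):
        # Only the code inside the `with` block contributes to the total.
        start = self._clock()
        try:
            yield
        finally:
            self.total_ns += self._clock() - start


def run_operator(batches, metric, wait=lambda: None):
    """Toy operator: doubles every value, timing only the compute step."""
    out = []
    for batch in batches:
        wait()  # stands in for waiting on upstream input or IO; deliberately untimed
        with metric.timer():
            out.append([x * 2 for x in batch])
    return out
```

Because the wait happens outside the timed section, a slow input source raises wall-clock time without raising the accumulated compute time.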

Technical Annotations (45)

Configuration Parameters (3)
datafusion.execution.parquet.maximum_parallel_row_group_writers (recommended: 1)
Defaults to 1 to minimize memory use; increase it to put idle cores to work when writing large files
datafusion.execution.parquet.maximum_buffered_record_batches_per_stream (recommended: 2)
Defaults to 2 to minimize memory use; increase it together with the row group writer count for better throughput
target_batch_size (recommended: 8192)
Default batch size for CoalesceBatchesExec; may need tuning based on cardinality
CLI Commands (2)
EXPLAIN ANALYZE <query> (diagnostic)
c.evaluate(batch)?.into_array(batch.num_rows()) (diagnostic)
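
EXPLAIN ANALYZE annotates each operator in the plan with its runtime metrics, including `elapsed_compute`. A minimal sketch of pulling those values out of the plan text is shown below. It assumes each operator line has the shape `Name: ..., metrics=[..., elapsed_compute=<value><unit>]`; the `SAMPLE` text is a hand-written stand-in rather than captured output, and `elapsed_compute_by_operator` is a hypothetical helper name.

```python
import re

# Nanoseconds per unit; both the micro sign and plain "us" are accepted.
_UNIT_NS = {"ns": 1, "µs": 1_000, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}

# Operator name at the start of the line, then the elapsed_compute metric.
_LINE = re.compile(r"^\s*(\w+):.*?elapsed_compute=([\d.]+)(ns|µs|us|ms|s)\b")

# Hand-written sample in the assumed layout (not real query output).
SAMPLE = """\
CoalesceBatchesExec: target_batch_size=8192, metrics=[output_rows=10, elapsed_compute=250µs]
  FilterExec: a@0 > 1, metrics=[output_rows=10, elapsed_compute=1.5ms]
"""


def elapsed_compute_by_operator(plan_text):
    """Return {operator name: total elapsed_compute in nanoseconds}."""
    totals = {}
    for line in plan_text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # line carries no elapsed_compute metric
        name, value, unit = match.groups()
        nanos = int(round(float(value) * _UNIT_NS[unit]))
        # Sum repeated operators of the same type.
        totals[name] = totals.get(name, 0) + nanos
    return totals


if __name__ == "__main__":
    for name, nanos in elapsed_compute_by_operator(SAMPLE).items():
        print(name, nanos)
```

Summing by operator name makes it easier to see which operator type dominates compute time when the same operator appears at several places in a plan.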
Technical References (40)
deadlock (concept)
NestedLoopJoinExec (component)
PR #16996 (component)
Tokio (component)
morsel-driven parallelism (concept)
AggregateMode::Partial (component)
AggregateMode::Final (component)
RepartitionExec (component)
hash value reuse (concept)
SinglePartitioned (component)
arrow::Row (component)
RowConverter (component)
CoalesceBatchesExec (component)
intern() (component)
dynamic partitioning (concept)
skipped_aggregation_rows metric (component)
GroupedHashAggregateStream (component)
arrow_row::variable::encode (component)
hash seed (concept)
ClickBench (component)
bucket distribution (concept)
cache locality (concept)
UTF-8 boundary checks (concept)
ASCII fast path (concept)
date_trunc (component)
timezone offset (concept)
quadratic complexity (concept)
hash collision (concept)
FilterExec (component)
target_partitions (configuration parameter)
HashJoinExec (component)
hash masking (concept)
update_hash (component)
collect_left_input (component)
UnionArray (component)
build_row_join_batch (component)
ScalarValue::to_array_of_size (component)
Large Union Type (component)
EBV (component)
coalesce (component)
Related Insights (21)
Blocking memory allocation causes deadlock risk when waiting for memory (warning)
Nested loop join with tiny left input and massive right input causes CPU saturation without progress (critical)
Undefined pipeline success rate and duration thresholds delay detection of data issues (warning)
Writing large parquet files with default parallelism settings underutilizes available cores (info)
Small batch size causes performance degradation through excessive allocations (warning)
Tokio async scheduler performs equivalently to custom push-based scheduler (info)
High cardinality aggregations incur triple hashing overhead in multi-phase repartition plans (warning)
Single-mode aggregation outperforms partial/final for high cardinality by avoiding row conversions (warning)
RepartitionExec and CoalesceBatchesExec overhead reduces aggregate performance (info)
Partial aggregation inefficiency with high cardinality causes performance degradation (warning)
RowConverter consumes 75% of aggregation time on high-cardinality group by operations (warning)
Hash seed reuse prevents rehashing during aggregation merge phase (info)
ASCII fast path bypassing improves string function performance up to 5x (info)
Timezone specialization for common cases improves date_trunc by 7x (info)
Same hash seed between HashMaps can cause quadratic complexity (warning)
Native DataFusion scan performance optimization opportunities identified (info)
CoalesceBatchesExec placement after joins misses optimization opportunities (info)
RepartitionExec double hashing causes unnecessary overhead (warning)
Duplicate expression evaluation wastes CPU during hash join build (info)
Nested Loop Join performance degrades 45x with Union Array types in DataFusion 50 (critical)
Complex filter expressions on Union types cause excessive evaluation overhead in Nested Loop Joins (warning)