datafusion.operator.memory_used
Memory used by operator
Dimensions: None
Available on: Native (1)
Interface Metrics (1)
Sources
Technical Annotations (75)
Configuration Parameters (11)
datafusion.execution.sort_spill_reservation_bytes [recommended: 10485760]
datafusion.execution.batch_size [recommended: 8192]
datafusion.execution.coalesce_batches [recommended: true]
datafusion.execution.parquet.maximum_parallel_row_group_writers [recommended: 1]
datafusion.execution.parquet.maximum_buffered_record_batches_per_stream [recommended: 2]
datafusion.execution.target_partitions [recommended: 1-4]
memory_limit [recommended: Set 500MB below actual available memory]
MEMORY_FRACTION [recommended: 1.0]
batch_size [recommended: Reduce from default (e.g., from 8192 to lower value)]
memory_pool.limit [recommended: increase from 1600 bytes minimum]
datafusion.memory_pool.limit
Error Signatures (11)
overflow [exception]
panicked at /Users/lili/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-select-57.3.0/src/interleave.rs:180:41 [exception]
memory allocation of 25690112 bytes failed [exception]
DatafusionError/ResourcesExhausted: Failed to allocate additional [exception]
Aborted (core dumped) [exit code]
ArrowError(InvalidArgumentError("number of columns(3) must match number of fields(2) in schema"), None) [exception]
number of columns must match number of fields in schema [log pattern]
ResourcesExhausted [exception]
intermediate_batch.num_rows() = 335544320 [log pattern]
intermediate_batch.get_array_memory_size() = 5368709312 [log pattern]
OOM [error code]
CLI Commands (3)
set datafusion.execution.target_partitions = 1; [diagnostic]
explain SELECT "WatchID", "ClientIP", COUNT(*) AS c FROM hits GROUP BY "WatchID", "ClientIP"; [diagnostic]
ulimit -v 1152000 [diagnostic]
Technical References (50)
fair pool [component]
spillable consumer [concept]
TopKHeap::emit_with_state() [component]
interleave_record_batch() [component]
arrow-select [component]
Utf8 [component]
i32::MAX [concept]
RepartitionExec [component]
pull_from_input [component]
output_channels [component]
RoundRobinBatch [component]
RowFormat [component]
dictionary interning [concept]
AggregateExec: mode=Partial [component]
AggregateExec: mode=FinalPartitioned [component]
Top K optimization [concept]
SortPreservingMergeExec [component]
GlobalLimitExec [component]
GroupedHashAggregateStream [component]
MemoryPool [component]
group_aggregate_batch() [component]
Vec::grow_amortized() [component]
datafusion/physical-plan/src/aggregates/row_hash.rs [file path]
AggregateExec [component]
FairSpillPool [component]
NestedLoopJoinExec [component]
record batch [concept]
probe-side [concept]
build-side [concept]
SortMergeJoin [component]
partition [concept]
MemoryReservation [component]
partitioned hash join [concept]
TPC-H [concept]
external join [component]
TopK [component]
Sort [component]
unnest [component]
GROUP BY [component]
array_agg [component]
streaming execution [concept]
Cartesian product [concept]
build_batch_from_indices [component]
apply_join_filter_to_indices [component]
nested loop join [component]
buffered_left_batches [component]
RecordBatchStream [component]
ExecutionPlan [component]
nested_loop_join.rs [file path]
batch_transformer [component]
Related Insights (23)
Fair pool unfairly allocates memory between spillable and non-spillable operators [warning]
TopK operator panics on Utf8 string column overflow beyond i32::MAX [critical]
RepartitionExec unbounded buffering causes memory spikes with unbalanced partition processing [warning]
Sort operations run out of memory when sort_spill_reservation_bytes is insufficient [critical]
Tiny output batches cause excessive metadata memory consumption [warning]
Writing large parquet files with default parallelism settings underutilizes available cores [info]
Memory explosion from dictionary interning in row format optimization [warning]
High cardinality aggregations cause memory usage to scale linearly with partition count [critical]
GROUP BY with ORDER BY and LIMIT still allocates memory for all groups [warning]
GroupedHashAggregateStream OOM from Vec exponential growth during group-by with large strings [critical]
Aggregate memory accounting updates only after full batch processing [warning]
Schema mismatch causes GroupedHashAggregateStream spill failure with multiple aggregations [critical]
RepartitionExec memory exhaustion during aggregation spill [warning]
Nested loop join creates excessive memory usage through oversized record batches [warning]
SortMergeJoin memory usage exceeds HashJoin with high partition counts [warning]
Partitioned hash join memory coordination failure with shared pool [warning]
TPC-H queries fail under fuzzed memory limits with external joins [critical]
TopK optimization not applied when limit pushdown fails with complex operators [warning]
Unnest with GROUP BY causes unbounded memory growth despite streaming [critical]
NestedLoopJoinExec creates extremely large intermediate batches causing memory exhaustion [critical]
NestedLoopJoin filter evaluation creates oversized intermediate batches [warning]
Nested loop join buffers entire left side causing OOM under memory constraints [critical]
Nested loop join produces massive intermediate result sets consuming memory [warning]
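
The recommended values listed under Configuration Parameters can be tried from a SQL session using the same set syntax that appears under CLI Commands. What follows is a minimal sketch rather than a tuned configuration. It assumes a datafusion-cli session and a DataFusion version that still accepts each of these options, and it picks 1 from the recommended 1-4 range for target_partitions. The memory limit parameters (memory_limit, MEMORY_FRACTION, memory_pool.limit) are left out because this page does not say how they are set.

```sql
-- Sketch only: values are the recommendations listed on this page,
-- not measured optima for any particular workload.
set datafusion.execution.sort_spill_reservation_bytes = 10485760;
set datafusion.execution.batch_size = 8192;
set datafusion.execution.coalesce_batches = true;
set datafusion.execution.parquet.maximum_parallel_row_group_writers = 1;
set datafusion.execution.parquet.maximum_buffered_record_batches_per_stream = 2;

-- Assumption: 1 is chosen from the recommended range of 1-4.
set datafusion.execution.target_partitions = 1;

-- Diagnostic query from the CLI Commands list; assumes a table named hits exists.
explain SELECT "WatchID", "ClientIP", COUNT(*) AS c FROM hits GROUP BY "WatchID", "ClientIP";
```

Running the explain statement before and after changing target_partitions gives two plans to compare, for example to see whether RepartitionExec still appears or how the AggregateExec stages are arranged.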