Apache DataFusion

Record count drop-off between pipeline stages indicates data loss

warning
performance
Updated Feb 17, 2026 (via Exa)
How to detect:

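A significant decrease in record count between pipeline stages (e.g., the source reads 10,000 records but only 500 reach the sink). Such a drop can come from intentional filtering, transformation errors, or data quality issues, so confirm that it is unexpected before treating it as data loss.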
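The check can be sketched in a few lines of Python. This is only an illustration: it assumes the per-stage counts have already been collected into an ordered list of (stage name, record count) pairs, and the function name `find_drop_offs` and the 50% threshold are hypothetical choices, not part of any pipeline API.

```python
from typing import List, Tuple


def find_drop_offs(
    stage_counts: List[Tuple[str, int]],
    max_drop_ratio: float = 0.5,  # hypothetical threshold; tune per pipeline
) -> List[Tuple[str, str, float]]:
    """Flag adjacent stage pairs whose record loss exceeds max_drop_ratio.

    Returns (upstream stage, downstream stage, fraction of records lost).
    """
    flagged = []
    # Compare each stage with the one immediately downstream of it.
    for (up_name, up_count), (down_name, down_count) in zip(
        stage_counts, stage_counts[1:]
    ):
        if up_count == 0:
            continue  # nothing flowed in, so no loss ratio can be computed
        lost = (up_count - down_count) / up_count
        if lost > max_drop_ratio:
            flagged.append((up_name, down_name, lost))
    return flagged


# Example counts mirroring the scenario above (the stage names are made up).
counts = [("source", 10_000), ("transform", 9_800), ("sink", 500)]
drop_offs = find_drop_offs(counts)
for up, down, lost in drop_offs:
    print(f"{up} -> {down}: {lost:.1%} of records lost")
```

A ratio-based threshold is used here rather than an absolute count so that the same check works for both small and large runs. Stages that filter on purpose would need a higher threshold or an exclusion.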

Recommended action:

Monitor per-stage record counts in the pipeline run details. If a drop-off is detected, add an Error Handler node after the transform stages to route failed records to a separate output (a BigQuery table or a GCS file). Examine the error records to identify the problematic data values.
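Once error records have been routed to a separate output, grouping them by error message is a quick way to see which problems dominate. The sketch below assumes the records have been exported and loaded as a list of dictionaries. The field names `error_message` and `order_date` and the helper `summarize_errors` are hypothetical, so adjust them to whatever schema the error output actually uses.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize_errors(
    error_records: List[Dict[str, str]],
    top_n: int = 3,
) -> List[Tuple[str, int]]:
    """Count error records per message and return the most frequent ones."""
    # "error_message" is an assumed field name, not a documented schema.
    messages = Counter(
        rec.get("error_message", "<unknown>") for rec in error_records
    )
    return messages.most_common(top_n)


# Made-up sample of routed error records.
sample_errors = [
    {"error_message": "invalid date", "order_date": "2026-13-01"},
    {"error_message": "null id", "order_date": "2026-01-05"},
    {"error_message": "invalid date", "order_date": "31/12/2025"},
]
summary = summarize_errors(sample_errors)
for message, count in summary:
    print(f"{count:>5}  {message}")
```

After the dominant message is known, inspecting the raw values of the affected field (here the hypothetical `order_date`) usually shows the pattern that the transform rejects.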