Ingestion Pipeline Failure Silent Data Quality Gaps
criticalDataHub ingestion jobs failing silently or with warnings, causing metadata gaps that prevent data quality monitoring, lineage tracking, and incident detection from functioning properly.
Monitor ingestion_failure rate and ingestion_warning count across all configured ingestion sources. Alert when failure rate exceeds baseline (e.g., >5% of jobs failing) or when specific critical sources (production databases, BI tools) fail. Track ingestion_time to detect jobs timing out or running abnormally long.
Review ingestion job logs for specific error messages (connection timeouts, authentication failures, schema parsing errors). Compare source_workunits_produced vs sink_workunits_write to identify if ingestion is extracting data but failing to write to DataHub. Check credentials, network connectivity, and source system health. For critical sources, implement redundant ingestion schedules and alerting to DataHub administrators via Slack/email when failures occur.