Hi all,
I'd like to start a discussion about a new observability feature I've
proposed in PR #68359, and to get maintainer input before it moves further.
*Problem*
When a DAG runs with `catchup=False` and the scheduler advances past missed
data intervals (e.g., after a restart or re-enable), those skips are
silent: no DagRun rows are created and no hook fires. Users who need
visibility into which intervals were skipped have no in-process signal and
must compare `last_automated_data_interval` against the wall clock externally.
This was raised in issue #66791.
*Proposed Solution*
The PR adds two symmetric extension points:
1. `on_skipped_intervals_callback` -- a DAG-level callback on the SDK `DAG`
constructor, using the same context-based signature as
`on_success_callback` / `on_failure_callback`. The context includes `dag`,
`reason` ("skipped_intervals"), and `skipped_range` (a `DataInterval` from
the previous automated run's `data_interval_end` to the new run's
`data_interval_start`).
2. `on_intervals_skipped` -- an AIP-61 listener hookspec for plugins that
need in-process observability without reloading the DAG file.
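To make the proposed callback shape concrete, here is a minimal sketch of a
function that could be passed as `on_skipped_intervals_callback`. It assumes
only the context keys described above (`dag`, `reason`, `skipped_range`).
`SkippedRange`, `_FakeDag`, and `describe_skipped_intervals` are hypothetical
stand-ins so the snippet runs without Airflow; they are not part of the PR.

```python
from datetime import datetime, timezone
from typing import Any, NamedTuple


class SkippedRange(NamedTuple):
    """Hypothetical stand-in for the DataInterval in context["skipped_range"]."""

    start: datetime
    end: datetime


class _FakeDag:
    """Hypothetical stand-in for the DAG object placed in the context."""

    dag_id = "example_dag"


def describe_skipped_intervals(context: dict[str, Any]) -> str:
    """Build a one-line summary of the skipped range from the callback context.

    Assumes the context carries `dag`, `reason`, and `skipped_range`,
    as described in the proposal.
    """
    skipped = context["skipped_range"]
    return (
        f"{context['dag'].dag_id}: {context['reason']} "
        f"from {skipped.start.isoformat()} to {skipped.end.isoformat()}"
    )


# Simulate the context the scheduler would hand to the callback.
example_context = {
    "dag": _FakeDag(),
    "reason": "skipped_intervals",
    "skipped_range": SkippedRange(
        start=datetime(2024, 1, 1, tzinfo=timezone.utc),
        end=datetime(2024, 1, 3, tzinfo=timezone.utc),
    ),
}
message = describe_skipped_intervals(example_context)
print(message)
```

With the PR applied, a DAG author would presumably wrap something like this in
a callback that logs or alerts, and pass it to the `DAG` constructor alongside
`catchup=False`.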
Skipped intervals are detected at scheduled DagRun creation time by
comparing the new run's `data_interval_start` against the prior run's
`data_interval_end`. The callback and listener fire only when
`not dag.catchup` and a callback or listener is registered, so there is no
overhead for DAGs that do not opt in.
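The detection rule above can be sketched as a small pure function. This is an
illustration only, not the code in the PR: `find_skipped_range` and its
parameters are hypothetical names, and it assumes that a gap exists exactly
when the new run's `data_interval_start` is later than the prior run's
`data_interval_end`.

```python
from __future__ import annotations

from datetime import datetime, timezone


def find_skipped_range(
    prev_end: datetime | None,
    new_start: datetime,
    *,
    catchup: bool,
    has_subscriber: bool,
) -> tuple[datetime, datetime] | None:
    """Return (start, end) of the skipped range, or None if nothing to report."""
    # Opt-in guard: do no work unless catchup is off and someone is listening.
    if catchup or not has_subscriber:
        return None
    # No prior automated run, or the new run starts where the last one ended.
    if prev_end is None or new_start <= prev_end:
        return None
    return (prev_end, new_start)


prev_end = datetime(2024, 1, 1, tzinfo=timezone.utc)
new_start = datetime(2024, 1, 3, tzinfo=timezone.utc)

gap = find_skipped_range(prev_end, new_start, catchup=False, has_subscriber=True)
no_gap = find_skipped_range(prev_end, prev_end, catchup=False, has_subscriber=True)
not_opted_in = find_skipped_range(prev_end, new_start, catchup=False, has_subscriber=False)
```

Putting the opt-in check first reflects the "no overhead" goal: DAGs without a
registered callback or listener return before any interval comparison.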
The callback is dispatched via `DagSkippedIntervalsCallbackRequest` ->
`DatabaseCallbackSink` -> DAG File Processor. The listener fires
synchronously in the scheduler.
Tests pass locally and cover scheduler detection, callback routing
and execution, listener firing, and the serialization roundtrip.
*Questions for the Community:*
- Is this the right approach for this observability gap, or is there a
preferred pattern?
- Should this go through a formal AIP given it introduces a new DAG
constructor parameter and listener hookspec?
PR: https://github.com/apache/airflow/pull/68359
Original issue: https://github.com/apache/airflow/issues/66791
Happy to address any questions or adjust the approach based on feedback.
Best,
Teghveer Singh Ateliey