shivaam opened a new pull request, #64089:
URL: https://github.com/apache/airflow/pull/64089
Adds progress tracking to the backfill UI: the banner and Backfills list page now show a progress bar (green for success, red for failed, gray for remaining) with a completion count (e.g., "3/6"). Completed backfills show "Completed" text.
### Changes
**Backend:**
- Added `num_runs` and `dag_run_state_counts` to `BackfillResponse`
- Single enrichment query (JOIN `backfill_dag_run` + `dag_run`, GROUP BY state)
- `num_runs` derived by summing state counts (inner join excludes rows with a NULL `dag_run_id`)
- All backfill endpoints enriched (list, get, pause, unpause, cancel)
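
For reviewers who want to see the shape of the enrichment query without reading the diff, here is a minimal sketch using the standard-library `sqlite3` module. The two-table schema is heavily simplified, and the helper name `state_counts_for` and the extra grouping by `backfill_id` are assumptions for illustration, not the code in this PR.

```python
import sqlite3

# Simplified, hypothetical schema: the real Airflow tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dag_run (id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE backfill_dag_run (
        id INTEGER PRIMARY KEY,
        backfill_id INTEGER NOT NULL,
        dag_run_id INTEGER  -- NULL when the slot was skipped
    );
    INSERT INTO dag_run (id, state) VALUES
        (1, 'success'), (2, 'success'), (3, 'failed'), (4, 'running');
    INSERT INTO backfill_dag_run (backfill_id, dag_run_id) VALUES
        (10, 1), (10, 2), (10, 3), (10, 4), (10, NULL), (10, NULL);
""")


def state_counts_for(conn, backfill_ids):
    """Return {backfill_id: {state: count}} using one grouped query."""
    if not backfill_ids:
        return {}
    placeholders = ",".join("?" * len(backfill_ids))
    rows = conn.execute(
        f"""
        SELECT bdr.backfill_id, dr.state, COUNT(*)
        FROM backfill_dag_run AS bdr
        JOIN dag_run AS dr ON dr.id = bdr.dag_run_id  -- inner join drops NULL dag_run_id rows
        WHERE bdr.backfill_id IN ({placeholders})
        GROUP BY bdr.backfill_id, dr.state
        """,
        list(backfill_ids),
    ).fetchall()
    counts = {bid: {} for bid in backfill_ids}
    for bid, state, n in rows:
        counts[bid][state] = n
    return counts


dag_run_state_counts = state_counts_for(conn, [10])[10]
# num_runs is derived from the counts, so skipped slots never enter the total.
num_runs = sum(dag_run_state_counts.values())
```

With the sample rows above, the two skipped slots (NULL `dag_run_id`) drop out of the join, so `num_runs` is 4 rather than 6.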
**Frontend:**
- New `BackfillProgressBar` shared component (used by both banner and table)
- Completed backfills → "Completed" text, active → progress bar
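
The segment arithmetic behind the bar can be sketched as follows. This is written in Python only to keep the example language-neutral; the real component is frontend code, and both the function name `progress_segments` and the choice to count success plus failed as "completed" in the label are assumptions.

```python
def progress_segments(state_counts, num_runs):
    """Split a bar into success / failed / remaining percentages plus a label."""
    if num_runs <= 0:
        # Nothing to draw; avoid dividing by zero.
        return {"success": 0.0, "failed": 0.0, "remaining": 0.0, "label": "0/0"}
    success = state_counts.get("success", 0)
    failed = state_counts.get("failed", 0)
    # Running, queued, and any other non-terminal states fall into "remaining".
    remaining = num_runs - success - failed
    return {
        "success": 100.0 * success / num_runs,
        "failed": 100.0 * failed / num_runs,
        "remaining": 100.0 * remaining / num_runs,
        "label": f"{success + failed}/{num_runs}",
    }


segments = progress_segments({"success": 2, "failed": 1, "running": 3}, 6)
```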
**Tests:**
- `test_get_backfill_with_dag_run_state_counts` covering mixed states
- All existing assertions updated for new fields
### Open design questions — requesting reviewer input
**Should this be a new API endpoint instead of enriching BackfillResponse?**
We considered `GET /backfills/{id}/dagRuns` returning the linked dag runs with state, letting the UI compute counts client-side. Reasons it may be better:
- Keeps `BackfillResponse` stable (no new fields on the public v2 API)
- Enables linking individual dag runs to backfills in the UI (not possible today)
- Supports future CLI `airflow backfill status` use case
- UI can poll progress independently of backfill metadata
The current approach (enriching BackfillResponse) was simpler to ship but couples progress data to every backfill fetch. Happy to refactor if reviewers prefer a separate endpoint.
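
To make the alternative concrete, here is a rough sketch of the client-side counting a consumer of the proposed endpoint would do. The endpoint does not exist yet, so the payload shape (`dag_runs` with `dag_run_id` and `state` keys) and the helper name `summarize` are purely hypothetical; the sketch is in Python, as a future CLI consumer might be.

```python
from collections import Counter

# Hypothetical response body for GET /backfills/{id}/dagRuns.
payload = {
    "dag_runs": [
        {"dag_run_id": "run_1", "state": "success"},
        {"dag_run_id": "run_2", "state": "success"},
        {"dag_run_id": "run_3", "state": "failed"},
        {"dag_run_id": "run_4", "state": "queued"},
    ]
}


def summarize(dag_runs):
    """Derive the same two fields this PR adds to BackfillResponse."""
    counts = Counter(run["state"] for run in dag_runs)
    return {
        "num_runs": sum(counts.values()),
        "dag_run_state_counts": dict(counts),
    }


summary = summarize(payload["dag_runs"])
```

The counting itself is trivial either way; the design question is whether the per-run payload is worth the extra request.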
**Other questions for reviewers:**
1. **Should `num_runs` include skipped `BackfillDagRun` rows?** Currently it only counts rows with an actual DagRun (inner join). Skipped slots (`exception_reason = "already exists"` / `"in flight"`) are excluded. Including them would change the denominator from "progress of created runs" to "progress across the entire date range." The tradeoff: including skipped slots gives a more complete picture, but a "Missing Runs" backfill that skipped all dates would show "0/6", which is confusing.
2. **Should completed backfills show final stats?** Currently the UI shows "Completed" text and hides the bar. A backfill that finished with 95 successes and 5 failures looks the same as one with 100 successes. Showing the final breakdown would be more informative, but the underlying data has a limitation: when a newer backfill reprocesses the same dates, `DagRun.backfill_id` gets reassigned, so old backfills lose their state counts. "Completed" text avoids exposing this stale data.
3. **Should running/queued be visually distinct?** The bar currently has three segments: green (success), red (failed), gray (remaining). Running and queued are lumped into gray. Splitting them out adds information but also visual complexity.
4. **Should `_enrich_backfill_responses` move to a shared utility?** It is currently in `routes/public/backfills.py` and imported by `routes/ui/backfills.py`, creating a cross-dependency between route modules.
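
To illustrate question 1, the sketch below compares the two candidate denominators. It models each backfill slot as either a dag run state or `None` for a skipped slot; that representation, the helper name `completion_label`, and treating success plus failed as "done" are assumptions for illustration only.

```python
def completion_label(slots, include_skipped):
    """Build the "done/total" label under either denominator choice.

    slots: one entry per BackfillDagRun row; a state string when a DagRun
    was created, or None when the slot was skipped.
    """
    created = [state for state in slots if state is not None]
    done = sum(1 for state in created if state in ("success", "failed"))
    total = len(slots) if include_skipped else len(created)
    return f"{done}/{total}"


mixed = ["success", "success", "failed", "running", None, None]
all_skipped = [None] * 6

current_label = completion_label(mixed, include_skipped=False)
range_label = completion_label(mixed, include_skipped=True)
skipped_current = completion_label(all_skipped, include_skipped=False)
skipped_range = completion_label(all_skipped, include_skipped=True)
```

For the mixed case the labels are "3/4" (current behavior) versus "3/6"; for the all-skipped case they are "0/0" versus the confusing "0/6" mentioned above.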
---
##### Was generative AI tooling used to co-author this PR?
- [X] Yes — Claude Code (claude-opus-4-6)
Generated-by: Claude Code (claude-opus-4-6) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)