I believe this is currently possible by just unselecting “downstream” before you click “Clear” in the UI. It should only clear the one middle task and not the downstream task(s).
I would prefer not to have a more detailed UI for skipping specific downstream tasks (or I want to say “bypassing”, since “skipped” is itself a task state). It might signal to users that specifying tasks to bypass is the ideal workflow, when in reality it should only be done occasionally for experiments or troubleshooting, as you mention, not as a common occurrence.

What I can agree with, though, is that the list of buttons in the dialog window for changing the state of a task looks a bit cluttered. There is probably a better UI/UX for that, but I don't think being able to check/uncheck individual downstream tasks is the way to go; that seems like it would be just as cluttered.

Alex Begg

On Fri, Jan 28, 2022 at 11:46 AM Hongyi Wang <whyni...@gmail.com> wrote:

> Hello everyone,
>
> I'd like to propose a new feature in Airflow -- allow users to specify tasks
> to skip when triggering a DAG run.
>
> From our own experience, this feature can be very useful when doing
> experiments, troubleshooting or re-running existing DAGs. And I believe it
> can benefit many Airflow users.
>
> To illustrate the use case, I am going to use the example below.
> task-a ☐ -> task-b ☑ -> task-c ☐
>
> Suppose we have a DAG containing 3 tasks. To troubleshoot "task-a" and
> "task-c", I want to trigger a manual DAG run and skip "task-b" (so I can save
> time and resources and focus on the other two tasks). To do so, today I have two
> options:
>
> Option 1: Trigger the DAG, then manually mark "task-b" as `SUCCESS`
> Option 2: Remove "task-b" from my DAG, then trigger the DAG
>
> Neither option is great. Option 1 can be troublesome when the DAG is
> large and there are multiple tasks I want to skip. Option 2 requires a change
> to the DAG file, which is not convenient for just troubleshooting.
>
> Therefore, I would love to discuss how we can provide an easy way for users
> to skip tasks when triggering a DAG.
>
> Things to consider are:
> 1) We should allow users to specify all tasks to skip at once when triggering a DAG
> 2) We should retain the dependencies between non-skipped tasks (in the above
> example, "task-c" won't start until "task-a" completes even if we skipped
> "task-b")
> 3) We should mark skipped tasks as `SKIPPED` instead of `SUCCESS` to make it
> more intuitive
> 4) The implementation should be easy, clean and low risk
>
> Here is my proposed solution (tested locally):
> Today, Airflow allows users to pass a JSON to the DAG run as {{dag_run.conf}}
> when triggering a DAG. The idea is that, before queuing task instances that
> satisfy their dependencies, `scheduler_job.py` (after we make some changes) will
> filter the task instances to skip based on the `dag_run.conf` the user passes in
> (e.g. {"skip_tasks": ["task-b"]}), then mark them as SKIPPED.
>
> Things I would love to discuss:
> - What do you think about this feature?
> - What do you think about the proposed solution?
> - Did I miss anything that you want to discuss?
> - Is it necessary to introduce a new state (e.g. MANUAL_SKIPPED) to
> differentiate it from SKIPPED?
>
> Howie
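
For reference, here is a rough sketch of the filtering step described in the proposal, using the three-task example from the thread. It is plain Python and does not touch Airflow at all. The function names `partition_tasks` and `effective_upstream` are hypothetical, and the only thing taken from the proposal is the `skip_tasks` key in the conf. It is meant to illustrate considerations 1) and 2), not to show how `scheduler_job.py` would actually be changed.

```python
from typing import Any, Dict, Iterable, List, Mapping, Optional, Set, Tuple


def partition_tasks(
    conf: Optional[Mapping[str, Any]], task_ids: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split task ids into (to_run, to_skip) based on conf["skip_tasks"].

    `conf` stands in for the JSON passed when triggering the DAG; the
    "skip_tasks" key is the one suggested in the proposal.
    """
    requested: Set[str] = set((conf or {}).get("skip_tasks", []))
    to_run: List[str] = []
    to_skip: List[str] = []
    for task_id in task_ids:
        (to_skip if task_id in requested else to_run).append(task_id)
    return to_run, to_skip


def effective_upstream(
    task_id: str, upstream: Mapping[str, Iterable[str]], skipped: Set[str]
) -> Set[str]:
    """Return the non-skipped tasks that still gate `task_id`.

    Skipped tasks are walked through, so their own upstream tasks are
    inherited and the ordering between the remaining tasks is kept.
    """
    gating: Set[str] = set()
    seen: Set[str] = set()
    stack: List[str] = list(upstream.get(task_id, ()))
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue
        seen.add(parent)
        if parent in skipped:
            # Bypass the skipped task and look at what it depended on.
            stack.extend(upstream.get(parent, ()))
        else:
            gating.add(parent)
    return gating


# The example from the thread: task-a -> task-b -> task-c, skipping task-b.
upstream: Dict[str, List[str]] = {
    "task-a": [],
    "task-b": ["task-a"],
    "task-c": ["task-b"],
}
conf = {"skip_tasks": ["task-b"]}

to_run, to_skip = partition_tasks(conf, upstream.keys())
gating_for_c = effective_upstream("task-c", upstream, set(to_skip))
```

With this input, "task-b" lands in `to_skip`, and "task-c" is still gated by "task-a", which matches the behaviour asked for in consideration 2).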