Hi Ryan,

Thanks for posting. I share exactly the same observation, and had a short laugh because the DAG question always comes up as an introduction when someone joins the party. I think a global renaming makes sense. Especially when we also rename Dataset to Asset, this is a reasonable step as well. The concepts can still stay the same.
So I hope I don't need to join you hiding below the desk, and +1 for raising the discussion. Technically, we can still consider keeping the detailed Python names the same, because the execution is still a DAG… but user-facing, it is a workflow.

Jens

> On 21. Oct 2024, at 23:56, Ryan Hatter <ryan.hat...@astronomer.io.invalid> wrote:
>
> Everyone please sheathe your swords... at least for now.
>
> The term "DAG" has very little meaning to Airflow users. Indeed, it has
> little meaning outside of some mathematicians and software engineers for
> whom the properties of a DAG actually matter. For someone new to data
> engineering or workflow orchestration, one of the first questions they will
> likely have is, "what on earth is a DAG?" The answer is almost always,
> "It's a directed acyclic graph. You don't need to worry about what that
> means; it's just a term for your workflow."
>
> The term "DAG" is problematic for at least a couple of important reasons:
>
> *Complexity for New Users*: As mentioned above, "DAG" is unnecessarily
> intimidating and confusing. We want Airflow to be approachable, and using
> technical jargon like "DAG" right off the bat creates an initial barrier to
> understanding.
>
> *Disconnect Between DAG and Workflow Concepts*: The DAG is just one
> component of an Airflow workflow. The workflow includes its schedule,
> retries, timeouts, a dozen other parameters, and other metadata that the
> DAG component doesn't account for.
>
> Consider the following from the Airflow homepage
> <https://airflow.apache.org/>.
>
> Apache Airflow® is a platform created by the community to programmatically
> author, schedule and monitor workflows.
>
> Then, if we look at the "What is Airflow?" docs page
> <https://airflow.apache.org/docs/apache-airflow/stable/index.html>, we can
> see that the docs explain what Airflow is without using "DAG."
> It's only in the *workflow* Python code that the term is introduced out of
> nowhere as a comment that awkwardly tries to explain it:
>
> # A DAG represents a workflow, a collection of tasks
>
> It makes sense to not refer to DAGs in these introductions to Airflow,
> because *Airflow doesn't orchestrate DAGs; it orchestrates workflows*. The
> DAG is the model that, for reasons irrelevant to almost every user,
> workflows must adhere to.
>
> So, I propose at least adding an alias for the term "DAG" and updating
> documentation to replace "DAG" with "workflow".
>
> For example, instead of...
>
> @dag(
>     schedule="@daily",
>     ...
>     dagrun_timeout=timedelta(hours=1)
> )
>
> Users could do...
>
> @workflow(
>     schedule="@daily",
>     ...
>     run_timeout=timedelta(hours=1)
> )
>
> And with that... I will start running away.
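
To make the alias idea concrete, here is a rough sketch of how a user-facing `workflow` decorator could sit on top of an unchanged `dag` decorator. It is only an illustration under stated assumptions: the `dag` function below is a simplified stand-in and not Airflow's real implementation, and both `workflow` and `run_timeout` are the hypothetical names from the proposal above, not an existing API.

```python
from datetime import timedelta


def dag(schedule=None, dagrun_timeout=None, **kwargs):
    """Simplified stand-in for a DAG decorator (NOT Airflow's real one)."""
    def decorator(func):
        # Record the settings on the function so they can be inspected.
        func.settings = {
            "schedule": schedule,
            "dagrun_timeout": dagrun_timeout,
            **kwargs,
        }
        return func
    return decorator


def workflow(*, run_timeout=None, **kwargs):
    """Hypothetical user-facing alias that forwards to dag()."""
    if run_timeout is not None:
        if "dagrun_timeout" in kwargs:
            raise TypeError("pass either run_timeout or dagrun_timeout, not both")
        # Translate the friendlier name to the internal one.
        kwargs["dagrun_timeout"] = run_timeout
    return dag(**kwargs)


@workflow(schedule="@daily", run_timeout=timedelta(hours=1))
def my_pipeline():
    pass
```

With something like this, the internal names could stay as they are (the execution is still a DAG), while users would only see `workflow` and `run_timeout`.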