I totally agree with doing away with the term DAG. The only problem (aside from actually telling people—including myself—to stop using the term) is to come up with a reasonable alternative.
I can’t recall who, but someone mentioned that “workflow” is not very accurate for Airflow. The term “definition” was proposed, but it’s a bit broad; I tried to use it in a few places and kept asking myself “what definition?” and wanting to clarify with “DAG definition” (defeating the purpose).

TP

> On 22 Oct 2024, at 13:07, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote:
>
> Hi Ryan,
>
> Thanks for posting. I share exactly the same observation, and had a short laugh
> because the DAG question is always an introduction if someone joins the
> party. I think a global renaming makes sense. Especially when we also rename
> Dataset to Asset this is also a reasonable step. Concepts still can stay the
> same.
>
> So I hope I don’t need to join hiding below the desk with you and +1 for
> raising the discussion.
>
> Technically we can still think about whether we keep details of Python names
> the same because the execution is still a DAG… but user facing it is a workflow.
>
> Jens
>
>> On 21. Oct 2024, at 23:56, Ryan Hatter <ryan.hat...@astronomer.io.invalid>
>> wrote:
>>
>> Everyone please sheathe your swords... at least for now.
>>
>> The term "DAG" has very little meaning to Airflow users. Indeed, it has
>> little meaning outside of some mathematicians and software engineers for
>> whom the properties of a DAG actually matter. For someone new to data
>> engineering or workflow orchestration, one of the first questions they will
>> likely have is, "what on earth is a DAG?" The answer is almost always,
>> "It's a directed acyclic graph. You don't need to worry about what that
>> means; it's just a term for your workflow."
>>
>> The term "DAG" is problematic for at least a couple of important reasons:
>>
>> *Complexity for New Users*: As mentioned above, "DAG" is unnecessarily
>> intimidating and confusing. We want Airflow to be approachable, and using
>> technical jargon like "DAG" right off the bat creates an initial barrier to
>> understanding.
>>
>> *Disconnect Between DAG and Workflow Concepts*: The DAG is just one
>> component of an Airflow workflow. The workflow includes its schedule,
>> retries, timeouts, a dozen other parameters, and other metadata that the
>> DAG component doesn’t account for.
>>
>> Consider the following from the Airflow homepage
>> <https://airflow.apache.org/>.
>>
>> Apache Airflow® is a platform created by the community to programmatically
>> author, schedule and monitor workflows.
>>
>> Then, if we look at the "What is Airflow?" docs page
>> <https://airflow.apache.org/docs/apache-airflow/stable/index.html>, we can
>> see that the docs explain what Airflow is without using "DAG." It's only in
>> the *workflow* Python code that the term is introduced out of nowhere as a
>> comment that awkwardly tries to explain it:
>>
>>     # A DAG represents a workflow, a collection of tasks
>>
>> It makes sense to not refer to DAGs in these introductions to Airflow,
>> because *Airflow doesn't orchestrate DAGs; it orchestrates workflows*. The
>> DAG is the model that, for reasons irrelevant to almost every user,
>> workflows must adhere to.
>>
>> So, I propose at least adding an alias for the term "DAG" and updating
>> documentation to replace "DAG" with "workflow".
>>
>> For example, instead of...
>>
>>     @dag(
>>         schedule="@daily",
>>         ...
>>         dagrun_timeout=timedelta(hours=1)
>>     )
>>
>> Users could do...
>>
>>     @workflow(
>>         schedule="@daily",
>>         ...
>>         run_timeout=timedelta(hours=1)
>>     )
>>
>>
>> And with that... I will start running away.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> For additional commands, e-mail: dev-h...@airflow.apache.org
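
For what it's worth, the alias idea from the quoted proposal could be sketched roughly as below. This is only an illustration and not Airflow code: the `dag` function here is a stand-in so the snippet runs without Airflow installed, and `workflow` and `run_timeout` are the hypothetical names from the proposal, not an existing API.

```python
from datetime import timedelta


def dag(schedule=None, dagrun_timeout=None, **kwargs):
    """Stand-in for a @dag-style decorator (NOT Airflow's implementation).

    It only records the arguments on the decorated function so the
    alias below has something to forward to.
    """
    def decorator(func):
        func.dag_kwargs = {
            "schedule": schedule,
            "dagrun_timeout": dagrun_timeout,
            **kwargs,
        }
        return func
    return decorator


def workflow(*, run_timeout=None, **kwargs):
    """Hypothetical alias: forwards to dag(), renaming run_timeout."""
    if run_timeout is not None:
        if "dagrun_timeout" in kwargs:
            raise TypeError("pass either run_timeout or dagrun_timeout, not both")
        # Translate the user-facing name to the existing internal one.
        kwargs["dagrun_timeout"] = run_timeout
    return dag(**kwargs)


@workflow(schedule="@daily", run_timeout=timedelta(hours=1))
def my_pipeline():
    pass
```

With an alias shaped like this, the underlying Python names could stay the same (as Jens suggests) while only the user-facing spelling changes.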