Hi Ryan,

Thanks for posting. I share exactly the same observation and had a short 
laugh, because the DAG question always comes up as an introduction when 
someone joins the party. I think a global renaming makes sense. Especially 
since we are also renaming Dataset to Asset, this is a reasonable step. The 
concepts can still stay the same.

So I hope I don't need to join you hiding under the desk, and +1 for 
raising the discussion.

Technically, we can still consider keeping the Python names the same 
internally, because the execution is still a DAG... but user-facing, it is a 
workflow.
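
A minimal sketch of how such an alias could look, only to illustrate the idea. 
It assumes nothing about Airflow's real internals: the `dag` function below is 
a simplified stand-in, not the actual decorator, and `workflow` and 
`run_timeout` are the hypothetical names from Ryan's proposal.

```python
from datetime import timedelta
from functools import wraps


def dag(schedule=None, dagrun_timeout=None, **kwargs):
    """Simplified stand-in for the existing decorator (NOT Airflow's code)."""
    def decorator(func):
        @wraps(func)
        def factory(*args, **kw):
            func(*args, **kw)
            # Return a plain dict so the sketch stays self-contained.
            return {
                "dag_id": func.__name__,
                "schedule": schedule,
                "dagrun_timeout": dagrun_timeout,
                **kwargs,
            }
        return factory
    return decorator


def workflow(run_timeout=None, **kwargs):
    """Hypothetical user-facing alias that maps onto the internal names."""
    if run_timeout is not None:
        if "dagrun_timeout" in kwargs:
            raise TypeError("pass either run_timeout or dagrun_timeout, not both")
        # Translate the workflow vocabulary to the unchanged internal name.
        kwargs["dagrun_timeout"] = run_timeout
    return dag(**kwargs)


@workflow(schedule="@daily", run_timeout=timedelta(hours=1))
def daily_report():
    pass


spec = daily_report()
```

The internals keep the DAG naming, and only the thin wrapper speaks 
"workflow", so both spellings could coexist during a transition.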

Jens

> On 21. Oct 2024, at 23:56, Ryan Hatter <ryan.hat...@astronomer.io.invalid> 
> wrote:
> 
> Everyone please sheathe your swords... at least for now.
> 
> The term "DAG" has very little meaning to Airflow users. Indeed, it has
> little meaning outside of some mathematicians and software engineers for
> whom the properties of a DAG actually matter. For someone new to data
> engineering or workflow orchestration, one of the first questions they will
> likely have is, "what on earth is a DAG?" The answer is almost always,
> "It's a directed acyclic graph. You don't need to worry about what that
> means; it's just a term for your workflow."
> 
> The term "DAG" is problematic for at least a couple important reasons:
> 
> *Complexity for New Users*: As mentioned above, "DAG" is unnecessarily
> intimidating and confusing. We want Airflow to be approachable, and using
> technical jargon like "DAG" right off the bat creates an initial barrier to
> understanding.
> 
> *Disconnect Between DAG and Workflow Concepts*: The DAG is just one
> component of an Airflow workflow. The workflow includes its schedule,
> retries, timeouts, a dozen other parameters, and other metadata that the
> DAG component doesn’t account for.
> 
> Consider the following from the Airflow homepage
> <https://airflow.apache.org/>.
> 
> "Apache Airflow® is a platform created by the community to programmatically
> author, schedule and monitor workflows."
> 
> Then, if we look at the "What is Airflow?" docs page
> <https://airflow.apache.org/docs/apache-airflow/stable/index.html>, we can
> see that the docs explain what Airflow is without using "DAG." It's only in
> the *workflow* Python code that the term is introduced out of nowhere as a
> comment that awkwardly tries to explain it:
> 
> # A DAG represents a workflow, a collection of tasks
> 
> It makes sense to not refer to DAGs in these introductions to Airflow,
> because *Airflow doesn't orchestrate DAGs; it orchestrates workflows*. The
> DAG is the model that, for reasons irrelevant to almost every user,
> workflows must adhere to.
> 
> So, I propose at least adding an alias for the term "DAG" and updating
> documentation to replace "DAG" with "workflow".
> 
> For example, instead of...
> 
> @dag(
>     schedule="@daily",
>     ...
>     dagrun_timeout=timedelta(hours=1)
> )
> 
> Users could do...
> 
> @workflow(
>     schedule="@daily",
>     ...
>     run_timeout=timedelta(hours=1)
> )
> 
> 
> And with that... I will start running away.

