I don't see a problem with the term DAG, especially when most other
platforms embrace the term wholeheartedly.
I don't see anything intimidating or confusing about it at all. Changing
the term, though, would be fairly confusing to most users who have been
using it for years.

On Tue, Oct 22, 2024 at 1:18 AM Tzu-ping Chung <t...@astronomer.io.invalid>
wrote:

> I totally agree with doing away with the term DAG. The only problem (aside
> from actually telling people—including myself—to stop using the term) is to
> come up with a reasonable alternative.
>
> I can’t recall who, but someone mentioned “workflow” is not very accurate
> for Airflow. The term “definition” was proposed, but it’s a bit broad; I
> tried to use it in a few places and kept finding myself doubting “what
> definition?” and wanting to clarify “DAG definition” (defeating the
> purpose).
>
> TP
>
>
> > On 22 Oct 2024, at 13:07, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote:
> >
> > Hi Ryan,
> >
> > Thanks for posting. I share exactly the same observation and had a short
> > laugh, because the DAG question always comes up as an introduction when
> > someone joins the party. I think a global renaming makes sense. Especially
> > as we are also renaming Dataset to Asset, this is a reasonable step. The
> > concepts can still stay the same.
> >
> > So I hope I don't need to join you hiding under the desk, and +1 for
> > raising the discussion.
> >
> > Technically, we can still consider keeping the details of the Python names
> > the same, because the execution is still a DAG… but user-facing it is a
> > workflow.
> >
> > Jens
> >
> > Sent from my Smartphone
> >
> >> On 21. Oct 2024, at 23:56, Ryan Hatter <ryan.hat...@astronomer.io.invalid> wrote:
> >>
> >> Everyone please sheathe your swords... at least for now.
> >>
> >> The term "DAG" has very little meaning to Airflow users. Indeed, it has
> >> little meaning outside of some mathematicians and software engineers for
> >> whom the properties of a DAG actually matter. For someone new to data
> >> engineering or workflow orchestration, one of the first questions they will
> >> likely have is, "what on earth is a DAG?" The answer is almost always,
> >> "It's a directed acyclic graph. You don't need to worry about what that
> >> means; it's just a term for your workflow."
> >>
> >> The term "DAG" is problematic for at least a couple of important reasons:
> >>
> >> *Complexity for New Users*: As mentioned above, "DAG" is unnecessarily
> >> intimidating and confusing. We want Airflow to be approachable, and using
> >> technical jargon like "DAG" right off the bat creates an initial barrier to
> >> understanding.
> >>
> >> *Disconnect Between DAG and Workflow Concepts*: The DAG is just one
> >> component of an Airflow workflow. The workflow includes its schedule,
> >> retries, timeouts, a dozen other parameters, and other metadata that the
> >> DAG component doesn’t account for.
> >>
> >> Consider the following from the Airflow homepage
> >> <https://airflow.apache.org/>.
> >>
> >> Apache Airflow® is a platform created by the community to programmatically
> >> author, schedule and monitor workflows.
> >>
> >> Then, if we look at the "What is Airflow?" docs page
> >> <https://airflow.apache.org/docs/apache-airflow/stable/index.html>, we can
> >> see that the docs explain what Airflow is without using "DAG." It's only in
> >> the *workflow* Python code that the term is introduced out of nowhere as a
> >> comment that awkwardly tries to explain it:
> >>
> >> # A DAG represents a workflow, a collection of tasks
> >>
> >> It makes sense to not refer to DAGs in these introductions to Airflow,
> >> because *Airflow doesn't orchestrate DAGs; it orchestrates workflows*. The
> >> DAG is the model that, for reasons irrelevant to almost every user,
> >> workflows must adhere to.
> >>
> >> So, I propose at least adding an alias for the term "DAG" and updating
> >> documentation to replace "DAG" with "workflow".
> >>
> >> For example, instead of...
> >>
> >> @dag(
> >>     schedule="@daily",
> >>     ...
> >>     dagrun_timeout=timedelta(hours=1)
> >> )
> >>
> >> Users could do...
> >>
> >> @workflow(
> >>     schedule="@daily",
> >>     ...
> >>     run_timeout=timedelta(hours=1)
> >> )
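
For reference, the alias idea quoted above could be prototyped roughly as
follows. This is only a sketch under stated assumptions, not how Airflow
implements or would implement it: the dag() function here is a stand-in that
merely records its arguments (it is not Airflow's decorator), and make_alias
and RENAMED_KWARGS are hypothetical names. The only renamed parameter taken
from the proposal is run_timeout for dagrun_timeout.

```python
from datetime import timedelta

# Hypothetical mapping from the proposed "workflow" keyword names to the
# existing "dag" keyword names. Only run_timeout comes from the proposal.
RENAMED_KWARGS = {"run_timeout": "dagrun_timeout"}


def make_alias(decorator_factory, renamed_kwargs):
    """Return a decorator factory that also accepts the new keyword names."""

    def alias(**kwargs):
        translated = {}
        for name, value in kwargs.items():
            old_name = renamed_kwargs.get(name, name)
            if old_name in translated:
                # Both the old and the new spelling were passed.
                raise TypeError(f"got multiple values for {old_name!r}")
            translated[old_name] = value
        return decorator_factory(**translated)

    return alias


def dag(**dag_kwargs):
    """Stand-in for a DAG decorator; it only records what it was given."""

    def decorate(func):
        func.dag_kwargs = dag_kwargs
        return func

    return decorate


# The alias accepts the new spelling and forwards the old one.
workflow = make_alias(dag, RENAMED_KWARGS)


@workflow(schedule="@daily", run_timeout=timedelta(hours=1))
def example():
    pass
```

With this shape the old names keep working unchanged, so existing code that
uses dag(...) and dagrun_timeout would not need to be touched.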
> >>
> >>
> >> And with that... I will start running away.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >