Agreed that the word DAG makes very little sense to someone new to workflow
orchestration. But it does also convey that the graph is acyclic. Sure, as
Bas mentioned, there are ways to work around it. Still, in my opinion, there
is generally no need for cyclic behavior in workflow orchestration. Most (*if
not all*) cases can in some way be covered in an acyclic manner
with multiple runs. Hence, the idempotency. So I would want the "acyclic"
word to stick.
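
To make the "acyclic with multiple runs" point concrete, here is a minimal
sketch in plain Python. It does not use any Airflow API; the task names and
the helpers `run_order` and `run_until_done` are hypothetical and only
illustrate the idea that a loop can be expressed as repeated runs of the same
acyclic graph, assuming each run is idempotent.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task graphs: each task maps to the set of tasks it depends on.
ACYCLIC = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
# Same tasks, but "extract" also waits on "load", which closes a cycle.
CYCLIC = {"extract": {"load"}, "transform": {"extract"}, "load": {"transform"}}


def run_order(graph):
    """Return a valid execution order, or None if the graph contains a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


def run_until_done(graph, is_done, max_runs=5):
    """Emulate a loop by re-running one acyclic graph until is_done(run_number)."""
    order = run_order(graph)
    if order is None:
        raise ValueError("graph must be acyclic")
    history = []
    for run_number in range(1, max_runs + 1):
        # Each run executes the same tasks in the same order (idempotent runs).
        history.append((run_number, order))
        if is_done(run_number):
            break
    return history
```

Here the "cycle" lives in the scheduling of runs, not in the graph itself, so
every individual run stays a DAG.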

Regards,
Avi

On Tue, Oct 22, 2024 at 12:41 PM <bernd.stroe...@kosakya.de> wrote:

> Brilliant, I am on my way to becoming an Airflow fan; so many new ideas.
>
> The term DAG is misleading; it should be replaced by the more general term
> Airflow (Workflow) Graph (AFG) or Airflow (Petri) Net (AFN) (maybe without
> a direction),
> and ... these graphs should be stored in a graph database.
>
> Every node or sub-graph of an Airflow Graph (AFG) might be assigned to an
> executable (Python, Rust, ...) member of a library.
>
> A running Graph might have a different structure than a configuration
> Graph.
>
> Forget that if you think it's bullshit.
>
> Best Regards
>
> Bernd Ströhle
> M: +49 171 5357916
> E: bernd.stroe...@gmail.com
>
>
> -----Original Message-----
> From: Igor Kholopov <ikholo...@google.com.INVALID>
> Sent: Tuesday, October 22, 2024 12:02 PM
> To: dev@airflow.apache.org
> Subject: Re: Airflow should deprecate the term "DAG" for end users
>
> Even though the term "DAG" is clearly suboptimal, it is part of the Airflow
> DAG definition interface at so many levels that any attempt to change it
> will only introduce more chaos, not reduce it. The only thing that is worse
> than a poorly chosen name in the code is when there are two ways to define
> the same thing. Countless articles and tutorials will suddenly become
> confusing as they all refer to workflows as "DAG"s.
>
> We are already at risk of scaring users away with a number of breaking
> changes in Airflow 3; promising even more breaking changes for the most
> basic things is not something that people are looking for. Attempting to
> change the fundamental terms will be interpreted as an even stronger signal
> of project immaturity.
>
> Given that, I oppose the idea of changing the term in the long run. I oppose
> even more strongly the idea of deprecating it in the DAG definition
> interface. We would do better to put our time and effort into other places
> in Airflow, of which there are plenty.
>
> Kind regards,
> Igor
>
> On Tue, Oct 22, 2024 at 10:36 AM Bas Harenslak <b...@astronomer.io.invalid>
> wrote:
>
> > Couple of thoughts:
> >
> > 1. The boundaries/properties of “DAG” have already faded over time,
> > for example there are now several ways to create cyclic graphs, e.g.
> > using the @continuous schedule. I imagine these properties vanishing
> > even more in the future, so from that perspective I support changing
> > “DAG” to a more generic name.
> >
> > 2. How other orchestration frameworks do naming:
> > Dagster: pipeline
> > Prefect: flow
> > Flyte: workflow
> > Temporal: workflow
> > Kestra: flow
> >
> > I think “workflow” is the most fitting name.
> >
> > 3. Given the large impact of this change, I suggest defining a clear
> > path forward. Would we first introduce the deprecation in Airflow 3,
> > and remove “DAG” in Airflow 4?
> >
> > Bas
> >
> > > On 22 Oct 2024, at 09:22, Neil <neil4r...@gmail.com> wrote:
> > >
> > > I don't see a problem with the term DAG, especially when most other
> > > platforms embrace the term wholeheartedly.
> > > I don't see anything intimidating or confusing about it at all;
> > > changing the term, though, would be fairly confusing to most users who
> > > have been using the term for years.
> > >
> > > On Tue, Oct 22, 2024 at 1:18 AM Tzu-ping Chung
> > > <t...@astronomer.io.invalid> wrote:
> > >
> > >> I totally agree with doing away with the term DAG. The only problem
> > >> (aside from actually telling people, including myself, to stop using
> > >> the term) is to come up with a reasonable alternative.
> > >>
> > >> I can’t recall who, but someone mentioned “workflow” is not very
> > >> accurate for Airflow. The term “definition” was proposed, but it’s a bit
> > >> broad; I tried to use it in a few places and kept finding myself
> > >> asking “what definition?” and wanting to clarify “DAG definition”
> > >> (defeating the purpose).
> > >>
> > >> TP
> > >>
> > >>
> > >>> On 22 Oct 2024, at 13:07, Jens Scheffler
> > >>> <j_scheff...@gmx.de.INVALID> wrote:
> > >>>
> > >>> Hi Ryan,
> > >>>
> > >>> Thanks for posting. I share exactly the same observation, and had a
> > >>> short laugh because the DAG question always comes up as an
> > >>> introduction when someone joins the party. I think a global renaming
> > >>> makes sense. Especially since we are also renaming Dataset to Asset,
> > >>> this is a reasonable step. The concepts can still stay the same.
> > >>>
> > >>> So I hope I don’t need to join you hiding below the desk, and +1 for
> > >>> raising the discussion.
> > >>>
> > >>> Technically, we can still consider keeping the details of the Python
> > >>> names the same, because the execution is still a DAG… but user-facing
> > >>> it is a workflow.
> > >>>
> > >>> Jens
> > >>>
> > >>> Sent from my Smartphone
> > >>>
> > >>>> On 21. Oct 2024, at 23:56, Ryan Hatter
> > >>>> <ryan.hat...@astronomer.io.invalid> wrote:
> > >>>>
> > >>>> Everyone please sheathe your swords... at least for now.
> > >>>>
> > >>>> The term "DAG" has very little meaning to Airflow users. Indeed, it
> > >>>> has little meaning outside of some mathematicians and software
> > >>>> engineers for whom the properties of a DAG actually matter. For
> > >>>> someone new to data engineering or workflow orchestration, one of the
> > >>>> first questions they will likely have is, "what on earth is a DAG?"
> > >>>> The answer is almost always, "It's a directed acyclic graph. You don't
> > >>>> need to worry about what that means; it's just a term for your
> > >>>> workflow."
> > >>>>
> > >>>> The term "DAG" is problematic for at least a couple of important
> > >>>> reasons:
> > >>>>
> > >>>> *Complexity for New Users*: As mentioned above, "DAG" is
> > >>>> unnecessarily intimidating and confusing. We want Airflow to be
> > >>>> approachable, and using technical jargon like "DAG" right off the bat
> > >>>> creates an initial barrier to understanding.
> > >>>>
> > >>>> *Disconnect Between DAG and Workflow Concepts*: The DAG is just
> > >>>> one component of an Airflow workflow. The workflow includes its
> > >>>> schedule, retries, timeouts, a dozen other parameters, and other
> > >>>> metadata that the DAG component doesn’t account for.
> > >>>>
> > >>>> Consider the following from the Airflow homepage
> > >>>> <https://airflow.apache.org/>.
> > >>>>
> > >>>> Apache Airflow® is a platform created by the community to
> > >>>> programmatically author, schedule and monitor workflows.
> > >>>>
> > >>>> Then, if we look at the "What is Airflow?" docs page
> > >>>> <https://airflow.apache.org/docs/apache-airflow/stable/index.html>,
> > >>>> we can see that the docs explain what Airflow is without using "DAG."
> > >>>> It's only in the *workflow* Python code that the term is introduced
> > >>>> out of nowhere as a comment that awkwardly tries to explain it:
> > >>>>
> > >>>> # A DAG represents a workflow, a collection of tasks
> > >>>>
> > >>>> It makes sense to not refer to DAGs in these introductions to
> > >>>> Airflow, because *Airflow doesn't orchestrate DAGs; it orchestrates
> > >>>> workflows*. The DAG is the model that, for reasons irrelevant to
> > >>>> almost every user, workflows must adhere to.
> > >>>>
> > >>>> So, I propose at least adding an alias for the term "DAG" and
> > >>>> updating documentation to replace "DAG" with "workflow".
> > >>>>
> > >>>> For example, instead of...
> > >>>>
> > >>>> @dag(
> > >>>>     schedule="@daily",
> > >>>>     ...
> > >>>>     dagrun_timeout=timedelta(hours=1)
> > >>>> )
> > >>>>
> > >>>> Users could do...
> > >>>>
> > >>>> @workflow(
> > >>>>     schedule="@daily",
> > >>>>     ...
> > >>>>     run_timeout=timedelta(hours=1)
> > >>>> )
> > >>>>
> > >>>>
> > >>>> And with that... I will start running away.
> > >>>
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>
> > >>
> > >>
> > >>
> > >>
> >
> >
>
>
>
>
