Yeah just say, when asked where the name comes from, "well, no one actually
knows but..." and then make something up.

On Tue, Oct 22, 2024 at 8:31 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Just to clarify - "directed acyclic graph" is the tongue-twister.
>
> On Tue, Oct 22, 2024 at 5:29 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > I like what both Daniel and Brent wrote. I would very much want to be
> able
> > to say just "dag" without explaining it further.
> >
> > For me every time I explain "DAG" at a talk it's a tongue-twister, and I
> > almost stutter on trying to recall how to pronounce it properly.
> >
> > J.
> >
> >
> > On Tue, Oct 22, 2024 at 5:27 PM Brent Bovenzi
> <br...@astronomer.io.invalid>
> > wrote:
> >
> >> I remember we explored renaming "DAG" when starting on AIP-38 to
> modernize
> >> the UI. Both "pipeline" and "workflow" are more descriptive of what one
> is
> >> actually doing while Directed Acyclic Graph is an implementation detail.
> >> But I agree with Daniel Standish, at this point "DAG" has become "dag",
> >> a word in its own right.
> >>
> >> Examples of "dag" abound in community discussion, Airflow Summit
> >> talks, documentation and even in the UI. Let's embrace "dag". A user
> just
> >> needs to learn one new word vs the technical concept behind that word. I
> >> think that is much less effort than refactoring so much code,
> >> documentation, blog posts, stack overflow questions, etc.
> >>
> >> On Tue, Oct 22, 2024 at 10:51 AM Daniel Standish
> >> <daniel.stand...@astronomer.io.invalid> wrote:
> >>
> >> > I am skeptical.  Seems like introducing a lot of pain for questionable
> >> > benefit.  But, I am def sympathetic to the idea.  I agree the
> >> association
> >> > with "directed acyclic graph" is not helpful.
> >> >
> >> > And along those lines, I offer here some less invasive mitigations.
> >> >
> >> > One thing we can do no matter what is to de-emphasize the math nerd
> >> origins
> >> > of the name.  That is to say, in docs / website / etc, *never define*
> >> > airflow's "dag" concept as a directed acyclic graph.  Always define it
> >> as a
> >> > pipeline, collection of tasks, workflow etc.
> >> >
> >> > The "directed acyclic graph" part of it is like a historical footnote,
> >> and
> >> > we could make one mention of it somewhere hidden.
> >> >
> >> > We could also start using lowercase in the docs in general e.g.
> writing
> >> > "dag" / "dags" instead of writing "DAG" / "DAGs" etc.  The upper case
> >> part
> >> > of it makes it look like an acronym; but "dag" in airflow is just an
> >> > airflow concept and the association with "DAGs" is not really helpful.
> >> >
> >> > In other words embrace that "dag" in airflow is its own thing, is
> >> > *not* strictly
> >> > speaking a directed acyclic graph (which nobody knows about anyway),
> and
> >> > tell them what it is in simple terms that normal people understand.
> >> >
> >> >
> >> > On Tue, Oct 22, 2024 at 7:27 AM Jarek Potiuk <ja...@potiuk.com>
> wrote:
> >> >
> >> > > DAG is so embedded into what we do that it will be extremely
> >> difficult to
> >> > > get rid of it completely. Also I think it will make a lot of
> "google"
> >> > > searches and "stack overflow" searches not finding the right
> answers.
> >> > This
> >> > > is one of the strengths of Airflow - besides the community and ideas
> >> that
> >> > > Bernd mentioned - is the vast number of examples, problems and
> >> solutions
> >> > > you can so easily find (and we have to remember that all the AI
> >> > > trained on past data will also rather poorly match people's
> >> > > queries).
> >> > >
> >> > > I am not too attached to DAG. I could easily switch. And if we do -
> I
> >> > > would be for using workflow or pipeline instead of `dag` if not the
> >> above
> >> > > reason, but I think I am here with Igor that it might cause more
> >> problems
> >> > > than it solves.
> >> > >
> >> > > But I am not 100% against - if others will think it's a good idea, I
> >> am
> >> > ok
> >> > > with it.
> >> > >
> >> > > J,
> >> > >
> >> > >
> >> > > On Tue, Oct 22, 2024 at 3:12 PM Abhishek Bhakat
> >> > > <abhishek.bha...@astronomer.io.invalid> wrote:
> >> > >
> >> > > > Agreed that the word DAG makes very little sense to someone new to
> >> > workflow
> >> > > > orchestration. But it does also show the nature of being acyclic.
> >> Sure,
> >> > > as
> >> > > > Bas mentioned, there are ways to work around it. Still, in my opinion,
> >> > > > there is generally no need for cyclic behavior in workflow
> >> > > > orchestration. Most (*if not all*) cases can in some way be covered in
> >> > > > an acyclic manner with multiple runs. Hence, the idempotency. So I
> >> > > > would want the "acyclic" word to stick.
> >> > > >
> >> > > > Regards,
> >> > > > Avi
> >> > > >
> >> > > > On Tue, Oct 22, 2024 at 12:41 PM <bernd.stroe...@kosakya.de>
> wrote:
> >> > > >
> >> > > > > Brilliant, I am on the way to become an Airflow Fan; so many new
> >> > ideas.
> >> > > > >
> >> > > > > The Term DAG is misleading; it should be replaced by the more
> >> general
> >> > > > Term
> >> > > > > Airflow (Workflow) Graph (AFG) or Airflow (Petri) Net (AFN)
> (maybe
> >> > > > without
> >> > > > > a direction);
> >> > > > > and ... these Graphs should be stored in a Graph Database.
> >> > > > >
> >> > > > > Every Node or Sub-Graph of an Airflow Graph (AFG) might be
> >> assigned
> >> > to
> >> > > an
> >> > > > > executable (Python-, Rust-, ... ) member of a library.
> >> > > > >
> >> > > > > A running Graph might have a different structure than a
> >> configuration
> >> > > > > Graph.
> >> > > > >
> >> > > > > Forget that if you think it's bullshit.
> >> > > > >
> >> > > > > Best Regards
> >> > > > >
> >> > > > > Bernd Ströhle
> >> > > > > M: +49 171 5357916
> >> > > > > E: bernd.stroe...@gmail.com
> >> > > > >
> >> > > > >
> >> > > > > -----Original Message-----
> >> > > > > From: Igor Kholopov <ikholo...@google.com.INVALID>
> >> > > > > Sent: Tuesday, October 22, 2024 12:02 PM
> >> > > > > To: dev@airflow.apache.org
> >> > > > > Subject: Re: Airflow should deprecate the term "DAG" for end
> users
> >> > > > >
> >> > > > > Even though the term "DAG" is clearly suboptimal, it is part of
> >> > Airflow
> >> > > > > DAG definition interface at so many levels, that any attempt to
> >> > change
> >> > > it
> >> > > > > will only introduce more chaos, not reduce it. The only thing
> >> that is
> >> > > > worse
> >> > > > > than a poorly chosen name in the code is when there are two ways
> >> to
> >> > > > define
> >> > > > > the same thing. Countless articles and tutorials will suddenly
> >> become
> >> > > > > confusing as they all refer to workflows as "DAG"s.
> >> > > > >
> >> > > > > We are already at risk of scaring the users away with a number
> of
> >> > > > breaking
> >> > > > > changes in Airflow 3, promising even more breaking changes for
> the
> >> > most
> >> > > > > basic things is not something that people are looking for.
> >> Attempting
> >> > > to
> >> > > > > change the fundamental terms will be interpreted as an even
> >> stronger
> >> > > > signal
> >> > > > > of project immaturity.
> >> > > > >
> >> > > > > Given that, I oppose the idea of changing the term in the long run.
> >> > > > > I even more strongly oppose the idea of deprecating it in the DAG
> >> > > > > definition interface.
> >> > > > > We better put our time and efforts in other places in Airflow,
> of
> >> > which
> >> > > > > there are plenty.
> >> > > > >
> >> > > > > Kind regards,
> >> > > > > Igor
> >> > > > >
> >> > > > > On Tue, Oct 22, 2024 at 10:36 AM Bas Harenslak
> >> > > <b...@astronomer.io.invalid
> >> > > > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Couple of thoughts:
> >> > > > > >
> >> > > > > > 1. The boundaries/properties of “DAG” have already faded over
> >> time,
> >> > > > > > for example there are now several ways to create cyclic
> graphs,
> >> > e.g.
> >> > > > > > using the @continuous schedule. I imagine these properties
> >> > vanishing
> >> > > > > > even more in the future, so from that perspective I support
> >> > changing
> >> > > > > > “DAG” to a more generic name.
> >> > > > > >
> >> > > > > > 2. How other orchestration frameworks do naming:
> >> > > > > > Dagster: pipeline
> >> > > > > > Prefect: flow
> >> > > > > > Flyte: workflow
> >> > > > > > Temporal: workflow
> >> > > > > > Kestra: flow
> >> > > > > >
> >> > > > > >         I think “workflow” is the most fitting name.
> >> > > > > >
> >> > > > > > 3. Given the large impact of this change, I suggest defining a
> >> > clear
> >> > > > > > path forward. Would we first introduce the deprecation in
> >> Airflow
> >> > 3,
> >> > > > > > and remove “DAG” in Airflow 4?
> >> > > > > >
> >> > > > > > Bas
> >> > > > > >
> >> > > > > > > On 22 Oct 2024, at 09:22, Neil <neil4r...@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > I don't see a problem with the term DAG, especially when
> most
> >> > other
> >> > > > > > > platforms embrace the term wholeheartedly.
> >> > > > > > > I don't see anything intimidating or confusing about it at
> >> all,
> >> > > > > > > changing the term though would be fairly confusing to most
> >> users
> >> > > who
> >> > > > > > > have been
> >> > > > > > using
> >> > > > > > > the term for years.
> >> > > > > > >
> >> > > > > > > On Tue, Oct 22, 2024 at 1:18 AM Tzu-ping Chung
> >> > > > > > > <t...@astronomer.io.invalid
> >> > > > > > >
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > >> I totally agree with doing away with the term DAG. The only
> >> > > problem
> >> > > > > > (aside
> >> > > > > > >> from actually telling people—including myself—to stop using
> >> the
> >> > > > > > >> term)
> >> > > > > > is to
> >> > > > > > >> come up with a reasonable alternative.
> >> > > > > > >>
> >> > > > > > >> I can’t recall who, but someone mentioned “workflow” is not
> >> very
> >> > > > > > accurate
> >> > > > > > >> for Airflow. The term “definition” was proposed, but it’s a
> >> bit
> >> > > > > > >> broad; I tried to use it in a few places and kept finding
> >> myself
> >> > > > > > >> doubting “what definition?” and wanting to clarify “DAG
> >> > > definition”
> >> > > > > > >> (defeating the purpose).
> >> > > > > > >>
> >> > > > > > >> TP
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>> On 22 Oct 2024, at 13:07, Jens Scheffler
> >> > > > > > >>> <j_scheff...@gmx.de.INVALID>
> >> > > > > > >> wrote:
> >> > > > > > >>>
> >> > > > > > >>> Hi Ryan,
> >> > > > > > >>>
> >> > > > > > >>> Thanks for posting. I share exactly the same observation, had
> >> > > > > > >>> a short laugh because the DAG question is always an introduction
> if
> >> > > > > > >> someone
> >> > > > > > joins
> >> > > > > > >> the party. I think a global renaming makes sense.
> Especially
> >> > when
> >> > > > > > >> we
> >> > > > > > also
> >> > > > > > >> rename Dataset to Asset this is also a reasonable step.
> >> Concepts
> >> > > > > > >> still
> >> > > > > > can
> >> > > > > > >> stay the same.
> >> > > > > > >>>
> >> > > > > > >>> So I hope I don’t need to join hiding below the desk with
> >> you
> >> > and
> >> > > > > > >>> +1
> >> > > > > > for
> >> > > > > > >> raising the discussion.
> >> > > > > > >>>
> >> > > > > > >>> Technically we can still think if we keep details of
> python
> >> > names
> >> > > > > > >>> the
> >> > > > > > >> same because the execution is still a DAG… but user facing
> it
> >> > is a
> >> > > > > > workflow.
> >> > > > > > >>>
> >> > > > > > >>> Jens
> >> > > > > > >>>
> >> > > > > > >>> Sent from my Smartphone
> >> > > > > > >>>
> >> > > > > > >>>> On 21. Oct 2024, at 23:56, Ryan Hatter <
> >> > > ryan.hat...@astronomer.io
> >> > > > > > .invalid>
> >> > > > > > >> wrote:
> >> > > > > > >>>>
> >> > > > > > >>>> Everyone please sheathe your swords... at least for now.
> >> > > > > > >>>>
> >> > > > > > >>>> The term "DAG" has very little meaning to Airflow users.
> >> > Indeed,
> >> > > > > > >>>> it
> >> > > > > > has
> >> > > > > > >>>> little meaning outside of some mathematicians and
> software
> >> > > > > > >>>> engineers
> >> > > > > > for
> >> > > > > > >>>> whom the properties of a DAG actually matter. For someone
> >> new
> >> > to
> >> > > > > > >>>> data engineering or workflow orchestration, one of the
> >> first
> >> > > > > > >>>> questions they
> >> > > > > > >> will
> >> > > > > > >>>> likely have is, "what on earth is a DAG?" The answer is
> >> almost
> >> > > > > > >>>> always, "It's a directed acyclic graph. You don't need to
> >> > worry
> >> > > > > > >>>> about what
> >> > > > > > that
> >> > > > > > >>>> means; it's just a term for your workflow."
> >> > > > > > >>>>
> >> > > > > > >>>> The term "DAG" is problematic for at least a couple
> >> important
> >> > > > > reasons:
> >> > > > > > >>>>
> >> > > > > > >>>> *Complexity for New Users*: As mentioned above, "DAG" is
> >> > > > > > >>>> unnecessarily intimidating and confusing. We want Airflow
> >> to
> >> > be
> >> > > > > > >>>> approachable, and
> >> > > > > > >> using
> >> > > > > > >>>> technical jargon like "DAG" right off the bat creates an
> >> > initial
> >> > > > > > >> barrier to
> >> > > > > > >>>> understanding.
> >> > > > > > >>>>
> >> > > > > > >>>> *Disconnect Between DAG and Workflow Concepts*: The DAG
> is
> >> > just
> >> > > > > > >>>> one component of an Airflow workflow. The workflow
> includes
> >> > its
> >> > > > > > >>>> schedule, retries, timeouts, a dozen other parameters,
> and
> >> > other
> >> > > > > > >>>> metadata that
> >> > > > > > the
> >> > > > > > >>>> DAG component doesn’t account for.
> >> > > > > > >>>>
> >> > > > > > >>>> Consider the following from the Airflow homepage
> >> > > > > > >>>> <https://airflow.apache.org/>.
> >> > > > > > >>>>
> >> > > > > > >>>> Apache Airflow® is a platform created by the community to
> >> > > > > > >> programmatically
> >> > > > > > >>>> author, schedule and monitor workflows.
> >> > > > > > >>>> Then, if we look at the "What is Airflow?" docs page
> >> > > > > > >>>> <
> >> > > https://airflow.apache.org/docs/apache-airflow/stable/index.html
> >> > > > > > >>>> >,
> >> > > > > > we
> >> > > > > > >> can
> >> > > > > > >>>> see that the docs explain what Airflow is without using
> >> "DAG."
> >> > > > > > >>>> It's
> >> > > > > > >> only in
> >> > > > > > >>>> the *workflow* Python code that the term is introduced
> out
> >> of
> >> > > > > > >>>> nowhere
> >> > > > > > >> as a
> >> > > > > > >>>> comment that awkwardly tries to explain it:
> >> > > > > > >>>>
> >> > > > > > >>>> # A DAG represents a workflow, a collection of tasks
> >> > > > > > >>>>
> >> > > > > > >>>> It makes sense to not refer to DAGs in these
> introductions
> >> to
> >> > > > > > >>>> Airflow, because *Airflow doesn't orchestrate DAGs; it
> >> > > > orchestrates
> >> > > > > workflows*.
> >> > > > > > >> The
> >> > > > > > >>>> DAG is the model that, for reasons irrelevant to almost
> >> every
> >> > > > > > >>>> user, workflows must adhere to.
> >> > > > > > >>>>
> >> > > > > > >>>> So, I propose at least adding an alias for the term "DAG"
> >> and
> >> > > > > > >>>> updating documentation to replace "DAG" with "workflow".
> >> > > > > > >>>>
> >> > > > > > >>>> For example, instead of...
> >> > > > > > >>>>
> >> > > > > > >>>> @dag(
> >> > > > > > >>>> schedule="@daily",
> >> > > > > > >>>> ...
> >> > > > > > >>>> dagrun_timeout=timedelta(hours=1)
> >> > > > > > >>>> )
> >> > > > > > >>>>
> >> > > > > > >>>> Users could do...
> >> > > > > > >>>>
> >> > > > > > >>>> @workflow(
> >> > > > > > >>>> schedule="@daily",
> >> > > > > > >>>> ...
> >> > > > > > >>>> run_timeout=timedelta(hours=1)
> >> > > > > > >>>> )
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>> And with that... I will start running away.
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > ------------------------------------------------------------------
> >> > > > > > >>> --- To unsubscribe, e-mail:
> >> dev-unsubscr...@airflow.apache.org
> >> > > > > > >>> For additional commands, e-mail:
> >> dev-h...@airflow.apache.org
> >> > > > > > >>>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
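
For reference, the alias Ryan proposes at the bottom of the thread (a `@workflow` decorator that accepts `run_timeout` where `@dag` takes `dagrun_timeout`) could be sketched roughly as below. This is only an illustration under stated assumptions: the `dag` function here is a small stand-in written for this example, not Airflow's real decorator, and `workflow`, `run_timeout` and the `config` attribute are hypothetical names taken from or invented around the proposal, not an existing Airflow API.

```python
from datetime import timedelta


def dag(schedule=None, dagrun_timeout=None, **kwargs):
    """Stand-in for an existing decorator (NOT Airflow's implementation).

    It only records the settings on the decorated function.
    """
    def decorator(func):
        func.config = {
            "schedule": schedule,
            "dagrun_timeout": dagrun_timeout,
            **kwargs,
        }
        return func
    return decorator


def workflow(schedule=None, run_timeout=None, **kwargs):
    """Hypothetical alias: user-facing names, same underlying behavior."""
    # Translate the friendlier parameter name back to the existing one.
    return dag(schedule=schedule, dagrun_timeout=run_timeout, **kwargs)


@dag(schedule="@daily", dagrun_timeout=timedelta(hours=1))
def old_style():
    pass


@workflow(schedule="@daily", run_timeout=timedelta(hours=1))
def new_style():
    pass
```

With an alias of this shape, both spellings produce the same stored configuration, so existing code, tutorials and search results that use `dag` would keep working while documentation could prefer `workflow`.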
