I totally agree with doing away with the term DAG. The only problem (aside from 
actually telling people—including myself—to stop using the term) is to come up 
with a reasonable alternative.

I can’t recall who, but someone mentioned “workflow” is not very accurate for 
Airflow. The term “definition” was proposed, but it’s a bit broad; I tried to 
use it in a few places and kept finding myself doubting “what definition?” and 
wanting to clarify “DAG definition” (defeating the purpose).

TP


> On 22 Oct 2024, at 13:07, Jens Scheffler <j_scheff...@gmx.de.INVALID> wrote:
> 
> Hi Ryan,
> 
> Thanks for posting. I share exactly the same observation, and had a short laugh 
> because the DAG question always comes up as an introduction when someone joins the 
> party. I think a global renaming makes sense. Especially since we are also renaming 
> Dataset to Asset, this is a reasonable step. The concepts can still stay the 
> same.
> 
> So I hope I don't need to join you in hiding below the desk, and +1 for 
> raising the discussion.
> 
> Technically, we can still consider keeping the Python names the same, 
> because the execution is still a DAG… but user-facing, it is a workflow.
> 
> Jens
> 
> 
>> On 21. Oct 2024, at 23:56, Ryan Hatter <ryan.hat...@astronomer.io.invalid> 
>> wrote:
>> 
>> Everyone please sheathe your swords... at least for now.
>> 
>> The term "DAG" has very little meaning to Airflow users. Indeed, it has
>> little meaning outside of some mathematicians and software engineers for
>> whom the properties of a DAG actually matter. For someone new to data
>> engineering or workflow orchestration, one of the first questions they will
>> likely have is, "what on earth is a DAG?" The answer is almost always,
>> "It's a directed acyclic graph. You don't need to worry about what that
>> means; it's just a term for your workflow."
>> 
>> The term "DAG" is problematic for at least a couple of important reasons:
>> 
>> *Complexity for New Users*: As mentioned above, "DAG" is unnecessarily
>> intimidating and confusing. We want Airflow to be approachable, and using
>> technical jargon like "DAG" right off the bat creates an initial barrier to
>> understanding.
>> 
>> *Disconnect Between DAG and Workflow Concepts*: The DAG is just one
>> component of an Airflow workflow. The workflow includes its schedule,
>> retries, timeouts, a dozen other parameters, and other metadata that the
>> DAG component doesn’t account for.
>> 
>> Consider the following from the Airflow homepage
>> <https://airflow.apache.org/>.
>> 
>> "Apache Airflow® is a platform created by the community to programmatically
>> author, schedule and monitor workflows."
>> 
>> Then, if we look at the "What is Airflow?" docs page
>> <https://airflow.apache.org/docs/apache-airflow/stable/index.html>, we can
>> see that the docs explain what Airflow is without using "DAG." It's only in
>> the *workflow* Python code that the term is introduced out of nowhere as a
>> comment that awkwardly tries to explain it:
>> 
>> # A DAG represents a workflow, a collection of tasks
>> 
>> It makes sense to not refer to DAGs in these introductions to Airflow,
>> because *Airflow doesn't orchestrate DAGs; it orchestrates workflows*. The
>> DAG is the model that, for reasons irrelevant to almost every user,
>> workflows must adhere to.
>> 
>> So, I propose at least adding an alias for the term "DAG" and updating
>> documentation to replace "DAG" with "workflow".
>> 
>> For example, instead of...
>> 
>> @dag(
>>     schedule="@daily",
>>     ...
>>     dagrun_timeout=timedelta(hours=1)
>> )
>> 
>> Users could do...
>> 
>> @workflow(
>>     schedule="@daily",
>>     ...
>>     run_timeout=timedelta(hours=1)
>> )
>> 
>> 
>> And with that... I will start running away.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> For additional commands, e-mail: dev-h...@airflow.apache.org
> 


