I'm partial to making everything we expect users to use importable from
`airflow`, but would love to hear other people's thoughts.
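
To make that concrete, here is a rough sketch of how a top-level package can
expose names lazily through a module-level `__getattr__` (PEP 562), so authors
get a single import location while the code itself lives elsewhere. The names
`example_pkg`, `example_pkg.utils.task_group` and the placeholder `TaskGroup`
class are made up for the illustration; this is not a description of how
Airflow is or will be laid out.

```python
import importlib
import sys
import types

# Hypothetical internal module where the class actually lives.
internal = types.ModuleType("example_pkg.utils.task_group")


class TaskGroup:
    """Placeholder for a real authoring class."""


internal.TaskGroup = TaskGroup
sys.modules["example_pkg.utils.task_group"] = internal

# Hypothetical top-level package: public name -> module that defines it.
pkg = types.ModuleType("example_pkg")
_LAZY_EXPORTS = {"TaskGroup": "example_pkg.utils.task_group"}


def _lazy_getattr(name):
    # PEP 562: only called when normal attribute lookup on the module fails.
    try:
        module_path = _LAZY_EXPORTS[name]
    except KeyError:
        raise AttributeError(
            f"module 'example_pkg' has no attribute {name!r}"
        ) from None
    value = getattr(importlib.import_module(module_path), name)
    setattr(pkg, name, value)  # cache so later lookups bypass this function
    return value


pkg.__getattr__ = _lazy_getattr
sys.modules["example_pkg"] = pkg

# What a user-facing file would then write:
from example_pkg import TaskGroup as PublicTaskGroup
```

The lazy lookup keeps `import example_pkg` cheap, since nothing is loaded
until a name is first used.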

On Fri, Aug 30, 2024 at 5:48 AM Ash Berlin-Taylor <a...@apache.org> wrote:

> Hi everyone,
>
> It’s time to have another discussion about everyone's favourite
> topic - naming things!
>
> Tl;dr if you have all of AIP-72 and its implications loaded in your head
> already:
>
> ##
> Where should DAG, TaskGroup, Labels, decorators etc for authoring be
> imported from inside the DAG files? Similarly for DagRun, TaskInstance etc.
> (these likely won’t be created directly by users, but just used for
> reference docs/type hints/editor completion)
> ##
>
> Assuming most people don’t fall into that category, read on :)
>
> Right now users import things into their DAG files from a few places.
> Some/most of these are (now) documented in
> https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html
>
> ```
> from airflow import DAG
> from airflow.decorators import task, task_group
> from airflow.utils.task_group import TaskGroup
> from airflow.utils.edgemodifier import Label  # For adding labels between nodes on graph
> ```
>
> The following packages are linked to from that doc too, so I guess they
> are considered quasi-public:
>
>
>     airflow.exceptions
>     airflow.models.dag
>     airflow.models.dagbag
>     airflow.models.param
>     airflow.models.dagrun
>     airflow.models.connection
>     airflow.models.variable
>     airflow.models.xcom
>     airflow.utils.state
>     airflow.hooks
>
> So as part of my work on AIP-72/Task Execution interface and SDK I want to
> tidy these up and “unify” the imports.
>
> My thinking is as follows:
>
> 1. Users should never import things from airflow.models (and in Airflow 3
> it will be impossible to do so outside of compatibility shims)
> 2. “TaskGroup” and the state enums should not be imported by users from
> utils (More generally I don’t like “utils” as a namespace/package as I find
> it’s where code just gets dumped, but that’s a separate point.)
>
>
> On the subject of Hooks, I think we should consider moving
> `get_connection` off of BaseHook (it’ll be implemented totally
> differently behind an API anyway) and on to a class method on Connection.
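
As a rough illustration of that shape (not the actual design, which the email
does not spell out), the lookup could hang off the connection class itself,
with the hook keeping a thin delegate. The method name `Connection.get`, the
`_FAKE_BACKEND` dict and the `my_db` entry are all hypothetical stand-ins for
whatever the API ends up providing.

```python
# Hypothetical in-memory stand-in for the API that would serve connections.
_FAKE_BACKEND = {"my_db": {"host": "db.example.com"}}


class Connection:
    """Simplified stand-in, not Airflow's real Connection class."""

    def __init__(self, conn_id, host=None):
        self.conn_id = conn_id
        self.host = host

    @classmethod
    def get(cls, conn_id: str) -> "Connection":
        # The lookup lives on the class, so callers do not need a hook.
        try:
            fields = _FAKE_BACKEND[conn_id]
        except KeyError:
            raise LookupError(f"connection {conn_id!r} is not defined") from None
        return cls(conn_id=conn_id, **fields)


class BaseHook:
    """Simplified stand-in for the hook base class."""

    @classmethod
    def get_connection(cls, conn_id: str) -> Connection:
        # Kept only as a delegate so older call sites keep working.
        return Connection.get(conn_id)
```

With this shape, `Connection.get("my_db")` and
`BaseHook.get_connection("my_db")` return equivalent objects, and only the
former needs to know how connections are fetched.
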
>
> So now to the crux of the naming debate, and repeating the question from
> the top:
>
> Where should DAG, TaskGroup, Labels, decorators etc for authoring be
> imported from inside the DAG files? Similarly for DagRun, TaskInstance
> (these two likely won’t be created directly by users, but just used for
> reference docs/type hints)
>
> We don’t have to worry about breaking things or needing every dag to be
> re-written as I already have a way of maintaining backwards-compatibility
> via a shim, so please think of this as “Given a greenfield, where
> should these imports live for our users”/“What makes most sense to see in
> DAG files”.
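
For anyone unfamiliar with the pattern, a compatibility shim of this general
kind can be sketched as below: the old import path stays importable, forwards
to the new location, and emits a `DeprecationWarning`. This is only an
illustration of the technique, not the shim Ash has (the email does not show
it); `old_location`, `new_location` and the placeholder `TaskGroup` are
invented names.

```python
import sys
import types
import warnings

# Hypothetical new home for the class.
new_home = types.ModuleType("new_location")


class TaskGroup:
    """Placeholder for a real authoring class."""


new_home.TaskGroup = TaskGroup
sys.modules["new_location"] = new_home

# Hypothetical old import path, kept alive as a shim.
old_home = types.ModuleType("old_location")
_MOVED = {"TaskGroup": ("new_location", "TaskGroup")}


def _shim_getattr(name):
    # Module-level __getattr__ (PEP 562): runs only for names not found normally.
    if name not in _MOVED:
        raise AttributeError(f"module 'old_location' has no attribute {name!r}")
    new_module, new_name = _MOVED[name]
    warnings.warn(
        f"old_location.{name} has moved to {new_module}.{new_name}",
        DeprecationWarning,
        stacklevel=2,
    )
    return getattr(sys.modules[new_module], new_name)


old_home.__getattr__ = _shim_getattr
sys.modules["old_location"] = old_home
```

Existing files that do `from old_location import TaskGroup` keep working and
receive the same class object as `new_location.TaskGroup`, which is what makes
it possible to pick the new names freely.
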
>
> I have some rough ideas but would like to get other people's views here
> first.
>
> Cheers,
> Ash
