Hi everyone,

It’s time to have another discussion about everyone’s favourite discussion - naming things!
Tl;dr, if you already have all of AIP-72 and its implications loaded in your head:

## Where should DAG, TaskGroup, Labels, decorators etc. for authoring be imported from inside the DAG files? Similarly for DagRun, TaskInstance etc. (these likely won’t be created directly by users, but just used for reference docs/type hints/editor completion) ##

Assuming most people don’t fall into that category, read on :)

Right now users import things into their DAG files from a few places. Some/most of these are (now) documented in https://airflow.apache.org/docs/apache-airflow/stable/public-airflow-interface.html

```
from airflow import DAG
from airflow.decorators import task, task_group
from airflow.utils.task_group import TaskGroup
from airflow.utils.edgemodifier import Label  # For adding labels between nodes on graph
```

The following packages are linked to from that doc too, so I guess they are considered quasi-public:

  airflow.exceptions
  airflow.models.dag
  airflow.models.dagbag
  airflow.models.param
  airflow.models.dagrun
  airflow.models.connection
  airflow.models.variable
  airflow.models.xcom
  airflow.utils.state
  airflow.hooks

So as part of my work on AIP-72/Task Execution interface and SDK I want to tidy these up and “unify” the imports. My thinking is as follows:

1. Users should never import things from airflow.models (and in Airflow 3 it will be impossible to do so outside of compatibility shims)
2. “TaskGroup” and the state enums should not be imported by users from utils (More generally I don’t like “utils” as a namespace/package as I find it’s where code just gets dumped, but that’s a separate point.)

On the subject of Hooks, I think we should consider moving `get_connection` off BaseHook and onto a class method on Connection (it’ll be implemented totally differently behind an API anyway).
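
To make the “compatibility shim” idea from point 1 a bit more concrete, here is a minimal sketch of one possible approach: a module-level `__getattr__` (PEP 562) that redirects old import paths to a new location and emits a deprecation warning. The module names `old_home` and `new_home`, the `_MOVED` table and the placeholder `DAG` class are all hypothetical stand-ins, built in memory so the example runs on its own; this is not necessarily how the actual shim is implemented.

```python
import importlib
import sys
import types
import warnings

# --- Hypothetical new location for authoring classes ------------------------
new_home = types.ModuleType("new_home")


class DAG:
    """Placeholder standing in for a real authoring class."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


new_home.DAG = DAG
sys.modules["new_home"] = new_home  # make it importable by name

# --- Hypothetical old location: holds only a redirect table -----------------
old_home = types.ModuleType("old_home")
_MOVED = {"DAG": "new_home"}  # attribute name -> module it now lives in


def _redirect(name):
    # PEP 562: invoked only when normal attribute lookup on the module fails.
    if name not in _MOVED:
        raise AttributeError(f"module 'old_home' has no attribute {name!r}")
    target = _MOVED[name]
    warnings.warn(
        f"old_home.{name} has moved to {target}.{name}",
        DeprecationWarning,
        stacklevel=2,
    )
    return getattr(importlib.import_module(target), name)


old_home.__getattr__ = _redirect
sys.modules["old_home"] = old_home

# An existing file using the old import path keeps working unchanged,
# but now triggers a DeprecationWarning pointing at the new location.
from old_home import DAG as LegacyDAG

dag = LegacyDAG("example")
```

With something along these lines, the old paths would resolve lazily to wherever the names end up living, so the choice of new location is not constrained by existing DAG files.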

So now to the crux of the naming debate, and repeating the question from the top:

Where should DAG, TaskGroup, Labels, decorators etc. for authoring be imported from inside the DAG files? Similarly for DagRun, TaskInstance (these two likely won’t be created directly by users, but just used for reference docs/type hints)

We don’t have to worry about breaking things or needing every DAG to be rewritten, as I already have a way of maintaining backwards-compatibility via a shim, so please think of this as “Given a greenfield, where should these imports live for our users?”/“What makes most sense to see in DAG files?”.

I have some rough ideas but would like to get other people’s views here first.

Cheers,
Ash