Should be:

```
@configure_settings
@configure_worker_plugins
def cli_worker():
    pass
```
On Sun, Sep 1, 2024 at 12:05 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Personally for me "airflow.sdk" is best and very straightforward. And we
> have not yet used that for other things before, so it's free to use.
>
> "Models" and similar carried more (often misleading) information - they
> were sometimes database models, sometimes they were not. This caused a lot
> of confusion.
>
> IMHO explicitly calling something "sdk" is a clear indication "this is
> what you are expected to use", and it makes very clear what is and what is
> not a public interface. We should aim to make everything in "airflow.sdk"
> (or whatever we choose) "public" and everything else "private". That should
> also reduce the need for a separate description of "what is public and
> what is not".
>
> Actually - if we continue doing import initialization as we do today - I
> would even go as far as a separate "airflow_sdk" package - unless we fix
> something that has been a problem for a long time: getting rid of the side
> effects of the "airflow" import.
>
> It's a bit tangential but actually related - as part of this work we
> should IMHO get rid of all side effects of "import airflow" that we
> currently have. If we stick to a sub-package of airflow, that is almost a
> given, since "airflow.sdk" (or whatever we choose) will be available to
> the "worker", "dag file processor" and "triggerer", but the rest of
> "airflow" will not be, and they won't be able to use the DB, whereas
> scheduler and api_server will.
>
> Having side effects on "import" - such as connecting to the DB,
> configuring settings, and plugin manager initialization - has caused a lot
> of pain, cyclic imports, and a number of other problems.
>
> I think we should aim to make "initialization" code explicit rather than
> implicit (Python zen) - and (possibly via decorators) simply initialize
> what is needed, in the right sequence, explicitly for each command.
> If we
> are able to do that, "airflow.sdk" is OK; if we still have "import
> airflow" side effects, then "airflow_sdk" (or similar) is better, because
> otherwise we will have to have some ugly conditional code - for when you
> have and when you do not have database access.
>
> As an example - if we go for "airflow.sdk" I'd love to see something like
> this:
>
> ```
> @configure_db
> @configure_settings
> def cli_db():
>     pass
>
> @configure_db
> @configure_settings
> @configure_ui_plugins
> def cli_webserver():
>     pass
>
> @configure_settings
> @configure_ui_plugins
> def cli_worker():
>     pass
> ```
>
> Rather than this:
>
> ```
> import airflow  # <-- here everything gets initialized
> ```
>
> J
>
>
> On Sat, Aug 31, 2024 at 10:17 PM Jens Scheffler <j_scheff...@gmx.de.invalid>
> wrote:
>
>> Hi Ash,
>>
>> I was thinking hard... was setting the email aside and still have no
>> real _good_ ideas. I am still good with "models" and "sdk".
>>
>> Actually what we want to define is an "execution interface" for which
>> the structural model as an API in Python (or another language) gives
>> bindings and helper methods. For the application it is around DAGs - but
>> naming it DAGs is not good because other non-DAG parts as side objects
>> also need to belong there.
>>
>> Other terms which came into my mind were "Schema", "System" and "Plan",
>> but all of these are not as good as the previous "models" or "SDK".
>>
>> "API", by the way, is too broad and generic and smells like remote. So
>> it should _not_ be "API".
>>
>> The term "Definitions" is a bit too long in my view.
>>
>> So... TLDR... this email is not much help other than saying that I'd
>> propose to use "airflow.models" or "airflow.sdk".
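As an editorial aside: the explicit-initialization idea above can be prototyped with ordinary Python decorators. The sketch below is illustrative only - the `configure_*` names, the `make_configurator` factory, and the `initialized` registry are hypothetical stand-ins, not actual Airflow APIs; real initializers would configure settings, connect to the DB, load plugins, etc.

```python
import functools

initialized = []  # records which subsystems have been set up (stand-in for real state)

def make_configurator(name):
    """Build a decorator that runs a named init step once, before the command."""
    done = False

    def decorator(command):
        @functools.wraps(command)
        def wrapper(*args, **kwargs):
            nonlocal done
            if not done:                  # initialize only on first use
                initialized.append(name)  # stand-in for the real setup work
                done = True
            return command(*args, **kwargs)
        return wrapper

    return decorator

# Hypothetical decorators mirroring the example in the email above.
configure_settings = make_configurator("settings")
configure_db = make_configurator("db")

@configure_db
@configure_settings
def cli_db():
    return "db command ran"

@configure_settings
def cli_worker():
    return "worker command ran"

cli_db()      # triggers db + settings initialization, outermost decorator first
cli_worker()  # settings already initialized; nothing is re-run
```

Because each `configure_*` decorator shares one `done` flag across every command it wraps, each subsystem is initialized at most once per process, and each command declares exactly the initialization it needs - no conditional "do we have a DB?" code.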
>> If there are no other / better ideas coming :-D
>>
>> Jens
>>
>> On 30.08.24 19:03, Ash Berlin-Taylor wrote:
>> >> As a side note, I wonder if we should do the user-internal separation
>> >> better for DagRun and TaskInstance
>> > Yes, that is a somewhat inevitable side effect of making it be behind
>> > an API, and one I am looking forward to. These are almost just
>> > plain-data classes (but not using data classes per se), so we will have
>> > two different classes — one that is the API representation, and a
>> > separate internal one used by the scheduler etc. that will have all of
>> > the scheduling logic methods.
>> >
>> > -ash
>> >
>> >> On 30 Aug 2024, at 17:55, Tzu-ping Chung <t...@astronomer.io.INVALID>
>> >> wrote:
>> >>
>> >>> On 30 Aug 2024, at 17:48, Ash Berlin-Taylor <a...@apache.org> wrote:
>> >>>
>> >>> Where should DAG, TaskGroup, Labels, decorators etc. for authoring be
>> >>> imported from inside the DAG files? Similarly for DagRun and
>> >>> TaskInstance (these two likely won’t be created directly by users, but
>> >>> just used for reference docs/type hints).
>> >>>
>> >> How about airflow.definitions? When discussing assets, a question was
>> >> raised on what we should call “DAG files” going forward (because those
>> >> files now may not contain user-defined DAGs at all). “Definition files”
>> >> was raised as a choice, but there’s no existing usage and it might take
>> >> a while to catch on. If we put all these things into
>> >> airflow.definitions, maybe people will start using that term?
>> >>
>> >> As a side note, I wonder if we should do the user-internal separation
>> >> better for DagRun and TaskInstance. We already have that separation for
>> >> DAG/DagModel, Dataset/DatasetModel, and more. Maybe we should also have
>> >> constructs that only users see, which are converted to “real” objects
>> >> (i.e. existing in the db) for the scheduler. We already sort of have
>> >> those in DagRunPydantic and TaskInstancePydantic; we just need to name
>> >> them better and expose them at the right places.
>> >>
>> >> TP
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>> >> For additional commands, e-mail: dev-h...@airflow.apache.org
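As an editorial illustration of the user-internal split discussed above: a plain-data, user-facing class alongside an internal class that owns the logic. All names here (`TaskInstanceRef`, `TaskInstanceInternal`, `to_ref`) are hypothetical sketches, not Airflow's actual classes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskInstanceRef:
    """Plain-data, user-facing view of a task instance (no DB access, immutable)."""
    dag_id: str
    task_id: str
    try_number: int

class TaskInstanceInternal:
    """Internal object used by the scheduler; would hold scheduling logic
    and be backed by the database in a real implementation."""

    def __init__(self, dag_id: str, task_id: str, try_number: int = 1):
        self.dag_id = dag_id
        self.task_id = task_id
        self.try_number = try_number

    def to_ref(self) -> TaskInstanceRef:
        """Convert to the plain-data representation exposed to user code."""
        return TaskInstanceRef(self.dag_id, self.task_id, self.try_number)

internal = TaskInstanceInternal("example_dag", "extract")
ref = internal.to_ref()  # the only object user code ever sees
```

The point of the split is that user code receives an immutable snapshot with no DB dependency, while scheduling logic stays on the internal class - the same pattern the thread describes for DAG/DagModel and Dataset/DatasetModel.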