Hi Ash,
I thought hard about this, set the email aside for a while, and still have
no real _good_ ideas. I am still fine with "models" and "sdk".
Actually what we want to define is an "execution interface" for which the
structural model, as an API in Python or another language, provides
bindings and helper methods. For the application it revolves around DAGs,
but naming it "DAGs" is not good because other non-DAG parts, as side
objects, also need to belong there.
Other terms which came to mind were "Schema", "System" and "Plan",
but none of these is as good as the previous "models" or "SDK".
"API", by the way, is too broad and generic and smells like something
remote. So it should _not_ be "API".
The term "Definitions" is a bit too long in my view.
So... TL;DR: this email is not much help other than saying that I'd
propose to use "airflow.models" or "airflow.sdk" if no other or better
ideas come up :-D
Jens
On 30.08.24 19:03, Ash Berlin-Taylor wrote:
As a side note, I wonder if we should do the user-internal separation better
for DagRun and TaskInstance
Yes, that is a somewhat inevitable side effect of making it be behind an API,
and one I am looking forward to. They are almost just plain-data classes (but
not using data classes per se), so we would have two different classes: one
that is the API representation, and a separate internal one used by the
scheduler etc. that will have all of the scheduling logic methods.
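To make the split concrete, here is a rough sketch of what two such classes might look like. Everything in it is hypothetical: the names DagRunData, SchedulerDagRun, needs_scheduling and to_data are invented for illustration and are not existing Airflow classes or methods, and a dataclass is used only for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DagRunData:
    """Hypothetical API representation: plain data, no scheduling logic."""

    dag_id: str
    run_id: str
    state: str
    logical_date: datetime


class SchedulerDagRun:
    """Hypothetical internal counterpart that carries scheduler-only methods."""

    def __init__(
        self, dag_id: str, run_id: str, state: str, logical_date: datetime
    ) -> None:
        self.dag_id = dag_id
        self.run_id = run_id
        self.state = state
        self.logical_date = logical_date

    def needs_scheduling(self) -> bool:
        # Assumed rule for illustration only; real scheduling logic is richer.
        return self.state in {"queued", "running"}

    def to_data(self) -> DagRunData:
        # Convert the internal object into the user-facing representation.
        return DagRunData(self.dag_id, self.run_id, self.state, self.logical_date)


internal = SchedulerDagRun(
    "example_dag", "manual__1", "queued", datetime(2024, 8, 30, tzinfo=timezone.utc)
)
public = internal.to_data()
```

The point of the sketch is only that user code would see the plain-data object, while the scheduling methods stay on the internal class.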
-ash
On 30 Aug 2024, at 17:55, Tzu-ping Chung <t...@astronomer.io.INVALID> wrote:
On 30 Aug 2024, at 17:48, Ash Berlin-Taylor <a...@apache.org> wrote:
Where should DAG, TaskGroup, Labels, decorators etc for authoring be imported
from inside the DAG files? Similarly for DagRun, TaskInstance (these two likely
won’t be created directly by users, but just used for reference docs/type hints)
How about airflow.definitions? When discussing assets, a question was raised
about what we should call “DAG files” going forward (because those files now
may not contain user-defined DAGs at all). “Definition files” was raised as a
choice, but there’s no existing usage and it might take a while to catch on. If we
put all these things into airflow.definitions, maybe people will start using
that term?
As a side note, I wonder if we should do the user-internal separation better
for DagRun and TaskInstance. We already have that separation for DAG/DagModel,
Dataset/DatasetModel, and more. Maybe we should also have constructs that users
only see, and are converted to “real” objects (i.e. ones that exist in the db) for the
scheduler. We already sort of have those in DagRunPydantic and
TaskInstancePydantic, we just need to name them better and expose them at the
right places.
TP
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org