Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-10 Thread Ash Berlin-Taylor
Oh one thing to note: In order to get both mypy and IntelliJ working with this, we needed to commit the type stubs, and in order to make Python not get confused by the src/airflow/_vendor/airflow_shared directory existing and treating it as a namespace package, so the loader installer code now lo
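
The namespace-package concern mentioned above can be reproduced in isolation. The following is a minimal sketch, not the loader code from the PR: the package names and the helper `is_namespace_package` are hypothetical, and it only shows that Python imports a bare directory (no `__init__.py`) found on `sys.path` as a namespace package.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def is_namespace_package(with_init: bool) -> bool:
    """Create a throwaway package directory and report whether Python
    imports it as a namespace package (hypothetical helper for illustration)."""
    name = "demo_regular_pkg" if with_init else "demo_namespace_pkg"
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = Path(root) / name
        pkg_dir.mkdir()
        if with_init:
            # A regular package is marked by an __init__.py file.
            (pkg_dir / "__init__.py").write_text("")
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(name)
            # A namespace package has no backing file, so __file__ is None.
            return getattr(module, "__file__", None) is None
        finally:
            # Undo the global side effects so the check can be repeated.
            sys.path.remove(root)
            sys.modules.pop(name, None)


if __name__ == "__main__":
    print("bare directory is a namespace package:", is_namespace_package(with_init=False))
    print("directory with __init__.py is a namespace package:", is_namespace_package(with_init=True))
```

An empty directory left behind on the import path is therefore still importable, which is presumably why the mere existence of the vendored directory matters here.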

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-10 Thread Ash Berlin-Taylor
Not quite final PR, but good enough that I want to see how it behaves on CI, and other IDEs etc https://github.com/apache/airflow/pull/53149 (We updated the `setup_idea.py`, so either re-run that or add the new src root manually) Let the naming discussions start! -ash > On 10 Jul 2025, at 12:5

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-10 Thread Jarek Potiuk
Yeah. When we get the final PR I will also want to test more scenarios - with IDE/mypy integration switching branches, uv syncing etc. and will be happy to help and document the contributor's doc to explain what and how to work with it. This would be a super cool thing if we get it to work seaml

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-09 Thread Amogh Desai
Agreed! Once the PR is up, we can have these implementation level discussions over there. Good chat however! Thanks & Regards, Amogh Desai On Wed, Jul 9, 2025 at 3:56 PM Jarek Potiuk wrote: > Yeah. I think extracting one-by-one, feature-by-feature that we want to > share to a separate distrib

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-09 Thread Jarek Potiuk
Yeah. I think extracting one-by-one, feature-by-feature that we want to share to a separate distribution is the best approach - it will actually also help with the "__init__.py" cleanup - because almost by definition - those distributions will not be able to "reach" outside - i.e. they only can be

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-09 Thread Amogh Desai
Probably, you make a valid point. Maybe this is an implementation detail, so we could figure it out as we start on a POC and factor in these things as we move along? But from an initial guess, I would think that execution time related items (if we manage to enumerate them) would be something that

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-08 Thread Jarek Potiuk
> Not that I am against your idea and we can surely expand as we need but we would not need to expand the "core_and_task_sdk" if we put only the relevant items into it. So if we move logging and config out, my question is what is really relevant to "stay" in "core_and_task_sdk" ? And what we know

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-08 Thread Amogh Desai
Yeah, I think what you are showcasing here is a step ahead of the initial proposal from Ash. From the original proposal, the `core_and_task_sdk` *can* have the things relevant to just those two distros. Logging, Config are modules that might be needed by airflow-ctl for example, so ideally, those

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Jarek Potiuk
> @Jarek Potiuk a little confused on what you mean there, I am understanding the direction but could you elaborate a bit more please? Let me elaborate: As I understand (maybe I am wrong?), the proposal is that we have a "core-and-task-sdk" folder which is a shared distribution that is vendored-

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Amogh Desai
I like the folder structure proposed by Ash and have no objections with it. "core_and_task_sdk" sounds good to me and justifies what it should do pretty well. @Jarek Potiuk a little confused on what you mean there, I am understanding the direction but could you elaborate a bit more please? Nami

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Jarek Potiuk
How about splitting it even more and having each shared "thing" named? "logging", "config" and sharing them explicitly and separately with the right "user" ? That sounds way more modular and we will be able to choose which of the shared "utils" we use where. J. On Mon, Jul 7, 2025 at 11:13 PM J

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Jens Scheffler
I like "core_and_task_sdk" the same as "core-and-task-sdk" - I have no problem and it is a path only. If we get to "dag-parser-scheduler-task-sdk-and-triggerer" which is a bit bulky we then should name it "all-not-api-server" :-D On 07.07.25 22:57, Ash Berlin-Taylor wrote: In case I did a bad

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Ash Berlin-Taylor
In case I did a bad job explaining it, the “core and task sdk” is not in the module name/import name, just in the file path. Anyone have other ideas? > On 7 Jul 2025, at 21:37, Buğra Öztürk wrote: > > Thanks Ash! Looks cool! I like the structure. This will enable all the > combinations and s

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Buğra Öztürk
Thanks Ash! Looks cool! I like the structure. This will enable all the combinations and structure looks easy to grasp. No strong stance on the naming other than maybe it is a bit long with `and`, `core_ctl` could be shorter, since no import path is defined like that, we can give any name for sure.

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Jarek Potiuk
Looks good but I think we should find some better logical name for core_and_sdk :) On Mon, 7 Jul 2025, 21:44, Jens Scheffler wrote: > Cool! Especially the "shared" folder with the ability to have > N-combinations w/o exploding project repo root! > > On 07.07.25 14:43, Ash Berlin-Taylor

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Jens Scheffler
Cool! Especially the "shared" folder with the ability to have N-combinations w/o exploding project repo root! On 07.07.25 14:43, Ash Berlin-Taylor wrote: Oh, and all of this will be explained in shared/README.md On 7 Jul 2025, at 13:41, Ash Berlin-Taylor wrote: Okay, so it seems we have agree

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Ash Berlin-Taylor
Oh, and all of this will be explained in shared/README.md > On 7 Jul 2025, at 13:41, Ash Berlin-Taylor wrote: > > Okay, so it seems we have agreement on the approach here, so I’ll continue > with this, and on the dev call it was mentioned that “airflow-common” wasn’t > a great name, so here is m

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-07 Thread Ash Berlin-Taylor
Okay, so it seems we have agreement on the approach here, so I’ll continue with this, and on the dev call it was mentioned that “airflow-common” wasn’t a great name, so here is my proposal for the file structure; ``` / task-sdk/... airflow-core/... shared/ kuberenetes/ pyproject.

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-04 Thread Jarek Potiuk
Yeah we have to try it and test - also building packages happens semi frequently when you run `uv sync` (they use some kind of heuristics to decide when) and you can force it with `--reinstall` or `--refresh`. Package build also happens every time when you run `ci-image build` now in breeze so it s

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-04 Thread Ash Berlin-Taylor
It’s not just release time, but any time we build a package which happens on “every” CI run. The normal unit tests will use code from airflow-common/src/airflow_common; the kube tests which build an image will build the dists and vendor in the code from that commit. There is only a single copy

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-04 Thread Amogh Desai
Thanks Ash. This is really cool and helpful that you were able to test both scenarios -- repo checkout and also installing from the vendored package and the resolution worked fine too. I like this idea compared to the relative import one for a few reasons: - It feels like it will take some time to

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-04 Thread Ash Berlin-Taylor
Okay, I think I’ve got something that works and I’m happy with. https://github.com/astronomer/airflow/tree/shared-vendored-lib-tasksdk-and-core This produces the following from `uv build task-sdk` - https://github.com/user-attachments/files/21058976/apache_airflow_task_sdk-1.1.0.tar.gz - https

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-03 Thread Ash Berlin-Taylor
Oh yes, symlinks will work, with one big caveat: It does mean you can’t use absolute imports in one common module to another. For example https://github.com/apache/airflow/blob/4c66ebd06/airflow-core/src/airflow/utils/serve_logs.py#L41 where we have ``` from airflow.utils.module_loading import

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-03 Thread Pavankumar Gopidesu
Thanks Ash. Yes, agreed, option 2 would be my preference, making sure we have all the guardrails to protect against any unwanted behaviour in code sharing and to execute the right set of tests between the packages. Agree with others, option 2 would be On Thu, Jul 3, 2025 at 10:02 AM Amogh Desai wrote: > Thanks

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-03 Thread Amogh Desai
Thanks for starting this discussion, Ash. I would prefer option 2 here with proper tooling to handle the code duplication at *release* time. It is best to have a dist that has all it needs in itself. Option 1 could very quickly get out of hand and if we decide to separate triggerer / dag processo

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-02 Thread Jens Scheffler
I'd also rather prefer option 2 - the reason here is that it is rather pragmatic and we do not need to cut another package and have fewer packages and dependencies. I remember some time ago I was checking (together with Jarek, I am not sure anymore...) if the usage of symlinks would be possible. T

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-02 Thread Kaxil Naik
I prefer Option 2 as well to avoid a matrix of dependencies On Thu, 3 Jul 2025 at 01:03, Jens Scheffler wrote: > I'd also rather prefer option 2 - reason here is it is rather pragmatic > and we do not need to cut another package and have less package counts > and dependencies. > > I remember some

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-02 Thread Jarek Potiuk
Yes - we already use symlinking actually. For standard provider's examples added to airflow-core. And yes that is a good option for anything shared in 2) mode. When packaging, by default such symlinks are stored as the files they point to. J. On Wed, Jul 2, 2025 at 9:33 PM Jens Scheffler wrote:
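
To illustrate what "stored as the files they point to" means for an archive, here is a small sketch using only the standard library `tarfile` module. It is a generic demonstration, not the build backend Airflow uses, and it needs a platform that allows creating symlinks; the file names and the helper `archive_member_types` are hypothetical.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def archive_member_types(dereference: bool) -> dict:
    """Archive a real file plus a symlink to it and report how each
    member was stored (hypothetical helper for illustration)."""
    with tempfile.TemporaryDirectory() as root:
        shared = Path(root) / "shared_module.py"
        shared.write_text("VALUE = 1\n")
        link = Path(root) / "linked_module.py"
        os.symlink(shared, link)

        archive = Path(root) / "dist.tar"
        # dereference=True stores the link target's content, not the link.
        with tarfile.open(str(archive), "w", dereference=dereference) as tar:
            tar.add(str(shared), arcname="shared_module.py")
            tar.add(str(link), arcname="linked_module.py")

        with tarfile.open(str(archive)) as tar:
            return {
                member.name: "symlink" if member.issym() else "file"
                for member in tar.getmembers()
            }


if __name__ == "__main__":
    print(archive_member_types(dereference=True))
    print(archive_member_types(dereference=False))
```

With dereferencing, the archive holds two ordinary files, so the shared code ships inside the distribution as a real copy even though the repository holds it only once.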

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-02 Thread Shahar Epstein
I support option 2 with proper automation & CI - the reasonings you've shown for that make sense to me. Shahar On Wed, Jul 2, 2025 at 3:36 PM Ash Berlin-Taylor wrote: > Hello everyone, > > As we work on finishing off the code-level separation of Task SDK and Core > (scheduler etc) we have com

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-02 Thread Jarek Potiuk
> My discussion was not so much about this specific use case of the logging config, but of sharing code in general which we know we need to do, as we currently have imports from core to sdk and from sdk to core, and we need to break that up. Yep. I understand that - and my answer is (and this is w

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-02 Thread Ash Berlin-Taylor
My discussion was not so much about this specific use case of the logging config, but of sharing code in general which we know we need to do, as we currently have imports from core to sdk and from sdk to core, and we need to break that up. -a > On 2 Jul 2025, at 14:44, Jarek Potiuk wrote: >

Re: Code sharing between Airflow Core and Task SDK - how do we achieve it

2025-07-02 Thread Jarek Potiuk
I think the answer **might** be different for different functionalities. For example I see that "config" is likely a better candidate for "shared distribution" than "logging" and for config we could use 1 where for logging we could use a form of 2. On a high level (And looking at the code). How mu