Ok. I described a bit more in slack discussions why repeating the folders is not a good idea (it's prone to tooling that might have different ways of handling PYTHONPATH for development).
But yeah - having the same name, for distribution and package makes sense. -> so maybe this will be a good structure: Maybe a good proposal will be to just : tests_common <- folder \pyproject.toml \src \airflow <-- Airflow Python package \ __init__.py <- legacy namespace package with __path__ = __import__("pkgutil").extend_path(__path__, __name__) \tests_common <- Python package That will avoid ambiguity in imports. But this change will be bigger (but only once) and `from airflow.tests_common import *` should be used everywhere. If there will be no objections, I will make a draft PR right after we remove the old UI. J. On Tue, Feb 18, 2025 at 6:28 PM Jarek Potiuk <ja...@potiuk.com> wrote: > BTW. After reading the proposed structure again python-subfolder -> that > is the biggest point to discuss it, and I would rather (at least now) first > move airflow_core to a sub-directory, not necessarily move all "python" > projects to a sub-folder. There is very little value in such grouping, and > even if we add new languages for sdks, at least initially, having them at > top-level as sub-folders, does not seem too problematic. But that should be > another discussion and conclusion as I said, let's incrementally change > what we have now to be "good pyproject.toml and workspace" support, and > then we can decide what to do next. once we do it, just moving folders will > be way easier. > > On Tue, Feb 18, 2025 at 6:20 PM Jarek Potiuk <ja...@potiuk.com> wrote: > >> > One thing I would like to avoid is having the `[test]` extra show up in >> the released packages though (it’s not important, just would be nice if we >> can avoid that) >> >> That is already not happening (and won't, all those devel extras in >> dependencies (e is gone at the moment I added hatchling, it handles dynamic >> extras removal for wheel packages, so almost all devel kind of extras have >> been removed back then - at the beginning of 2024 - I think it was Airflow >> 2.7? 2.8? can't remember). The only remaining extra in release package was >> "devel-ci" used exclusively for optimising our image building, but we can >> remove it in airflow 3 now, because that has been replaced with "stash" >> action usage and utilising it to reuse `uv` cache across the builds, so we >> do not need `devel-ci` any more. That is one of the next cleanups that are >> possible to be done now after all the "foundational" cleanups in the >> packaging :D. Airflow is getting cleaner :) >> >> Also - as explained before in the near future, we will be able to get rid >> of all the devel extras even for "editable" installs - once `pip` will get >> support for "dependency-groups" https://peps.python.org/pep-0735/ >> approved - we will be able to fully switch to dependency groups for both >> `uv` and `pip` installation, so the `devel` kind of extras will be gone >> completely - not only from wheels, but also from editable installations. >> And even more - next step, once https://peps.python.org/pep-0771/ Default >> Extras for Python Software Packages (currently Draft) will be accepted and >> implemented (this time it will likely just have to be implemented by >> hatchling backend or whatever backend we will choose) - then we will likely >> be able to simplify the dynamic dependencies we have now - and bring them >> back from being calculated in hatch_build.py to Airflow's >> pyproject.toml (actually likely we should be able to make >> optional-dependencies non-dynamic even before PEP-0771 is implemented - >> because I think dependency groups are enough to make our optional >> dependencies non-dynamic. >> >> > I wonder if now is also the time to move all the python code under a >> sub-folder >> >> Absolutely. That was absolutely my goal as well. And I did not need >> "dagster" inspiration :). it's a very natural consequence of all >> other moves (and yes `uv` supports both types of workspaces - one where the >> top project is in the root and one when the root is just the workspace with >> no code, or just a bootstrap, which is pretty cool). >> >> For now I was holding up this discussion to the next step, but yes, >> absolutely, that would be the next topic to discuss and absolutely the way >> to go. I think though it will require a bit more discussion on consequences >> and transition process - especially that we are in the middle of >> super-heavy modifications to Airflow-core, also some removals (old UI! >> YAY!) are about to happen that will make this move quite a bit easier. >> Moving tests_common as an incremental step will help us also to learn what >> it means - to move a "root" Python package to a "folder/src". I expect a >> few surprises, (so called "unknown unknowns") and I think it's better to >> solve them now with tests_common, which is far less "busy" than airflow, >> and then discussion about moving "airflow" to its own sub-folder would be >> much more informed. >> >> One thing-at-a-time is my motto here. We have at least a few dozen people >> actively contributing to airflow repo and whatever we break, we should >> break a little, incrementally. And we will break things, I am sure. >> >> > This extra top level python_modules folder might help with the >> PYTHONPATH issue? >> >> I am afraid not entirely. It might or might not and it will also depend >> very much on the tooling used by our contributors. It really depends on >> interpretation of tools, IDEs. For example IntelliJ/Pycharm will (as far as >> I know) always add content root to the PYTHONPATH, regardless, and there >> are also some default python behaviours (scripts adding parent folder of >> the script to PYTHONPATH for example) that are somewhat unexpected and I >> find using different "folder" name and "top package name" as the easiest >> and most robust way to prevent those difficult-to-debug things happening. >> I'd prefer to prevent those issues rather than having to diagnose and fix >> them :). >> >> J. >> >> >> >> >> On Tue, Feb 18, 2025 at 10:06 AM Ash Berlin-Taylor <a...@apache.org> >> wrote: >> >>> One thing I would like to avoid is having the `[test]` extra show up in >>> the released packages though (it’s not important, just would be nice if we >>> can avoid that) >>> >>> I wonder if now is also the time to move all the python code under a >>> sub-folder >>> >>> So something like this >>> >>> Python_modules/ >>> Airflow_core/ >>> Providers/ >>> task-sdk/ >>> Helm/ >>> Dev/ >>> Breeze/ >>> >>> Etc. (Casing not relevant, just auto-correct) >>> >>> This extra top level python_modules folder might help with the >>> PYTHONPATH issue? >>> >>> (Yes, I’m using dragster somewhat as inspiration for this >>> https://github.com/dagster-io/dagster/ ) >>> >>> >>> > On 18 Feb 2025, at 04:51, Amogh Desai <amoghdesai....@gmail.com> >>> wrote: >>> > >>> > +1 to this idea overall. >>> > >>> > A bit torn on naming it "common_test_code" -- no strong reason for it >>> but >>> > names like: `airflow_test_utils` or >>> > `airflow_test_shared` sound better to me. No strong objection though. >>> > >>> > Thanks & Regards, >>> > Amogh Desai >>> > >>> > >>> > On Mon, Feb 17, 2025 at 11:16 PM Jarek Potiuk <ja...@potiuk.com> >>> wrote: >>> > >>> >> Main reason is that this might avoid duplication and remove ambiguity >>> of >>> >> what is being imported. If we keep the same name, we will have to have >>> >> something like that: >>> >> >>> >> a) folder where project is >>> >> b) python package we import >>> >> >>> >> >>> >> So ...if we do tests_common, we will have to do: >>> >> >>> >> tests_common <- folder >>> >> \pyproject.toml >>> >> \src >>> >> \tests_common <- Python package >>> >> >>> >> >>> >> And when you do: >>> >> >>> >> import tests_common >>> >> >>> >> Depending on the tooling, PYTHONPATH, whether implicit python >>> packages are >>> >> enabled, WEIRD things might happen. We've seen it when we had "smtp" >>> >> provider name and it clashed with built-in package and that's why we >>> have >>> >> now "unit.smtp", "integrations.smtp" and "system.smtp" as "canonical >>> >> imports" to avoid this problem. >>> >> >>> >> This is similar - importing tests_common in this case might behave >>> >> differently if we keep the same name. That was my main motivation to >>> have a >>> >> "different" name - whether it's "common_test_code" or something else, >>> does >>> >> not matter, it just has to be significantly different. >>> >> >>> >> Another approach could be what we did in providers, add extra package >>> >> inside the project, and import "tests_common" with a package prefix >>> (for >>> >> example `airflow_tests`): >>> >> >>> >> >>> >> tests_common <- folder >>> >> \pyproject.toml >>> >> \src >>> >> \airflow_tests <- Python package >>> >> \tests_common <- Python package >>> >> >>> >> But then we would have to change all the tests that import >>> tests_common to: >>> >> >>> >> from `airflow_tests.tests_common` >>> >> >>> >> Easy to do, but I wanted to avoid another 1000+ files PR for our poor >>> >> reviewers :) >>> >> >>> >> So basically, I think we have two options: >>> >> >>> >> a) different "folder" name where we keep the project >>> >> b) adding parent package to "tests_common" >>> >> >>> >> We could do both, I am happy with either approach but maybe we should >>> do a >>> >> small unscientific poll, and have others propose better names (you >>> know, >>> >> naming is hard :D). >>> >> >>> >> J >>> >> >>> >> >>> >> pon., 17 lut 2025, 16:43 użytkownik Vincent Beck <vincb...@apache.org >>> > >>> >> napisał: >>> >> >>> >>> Overall +1 on this one. Regarding the naming, why not keeping >>> >>> "tests_common" instead of "common_test_code"? I am not a big fan of >>> >>> "common_test_code" but it is obviously just a personal opinion (as >>> it is >>> >>> always with naming :)) >>> >>> >>> >>> On 2025/02/16 13:30:09 Jarek Potiuk wrote: >>> >>>>> Just wondernig... would an optional dependency not be the right >>> place >>> >>> to >>> >>>> describe that apache-airflow-providers-google[tests] would have an >>> >>>> dependency to the common_tests subproject? >>> >>>>> Would mean you would need to install via >>> >>>>> pip install -e . -e ./task_sdk[tests] -e. ./providers/google[tests] >>> >>>> >>> >>>> Something like that. This is more of a details of workspace but we >>> are >>> >>>> going to use "dev" dependency group for that - rather than extra. >>> >> Details >>> >>>> to be worked out how pip interaction will look like (pip does not >>> have >>> >>>> support for dependency groups yet - they are coming in 25.1 - they >>> are >>> >>>> already merged in main) >>> >>>> >>> >>>> With `uv` dependency groups can be used today and "dev" dependency >>> >> group >>> >>> is >>> >>>> installed automatically when you run uv sync or `uv pip install >>> -e.` -> >>> >>> so >>> >>>> we will follow this along >>> >>>> >>> >>>> See details about "development dependencies" in uv here >>> >>>> >>> >>> >>> >> >>> https://docs.astral.sh/uv/concepts/projects/dependencies/#development-dependencies >>> >>>> >>> >>>> Dependency groups are already approved via PEP-735 >>> >>>> https://peps.python.org/pep-0735/ and as mentioned - we are a >>> release >>> >>> away >>> >>>> from having them released in `pip`, so for now we will have to >>> emulate >>> >> it >>> >>>> with extras for pip case I think but we will be able to remove it >>> when >>> >>> pip >>> >>>> 25.1 is released and matures a bit. >>> >>>> >>> >>>> J. >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> On Sun, Feb 16, 2025 at 2:23 PM Jarek Potiuk <ja...@potiuk.com> >>> wrote: >>> >>>> >>> >>>>>> I would love to see some airflow_testing package which will be >>> >>> useful for >>> >>>>> testing airflow-related projects and involve independently. >>> >>>>> >>> >>>>>> Certainly, it's not a good thing to have tests import something >>> >> from >>> >>>>> tests. >>> >>>>> New packages as projects are cheap and provide more flexibility and >>> >> are >>> >>>>> useful from outside of the project. >>> >>>>> >>> >>>>> I already explained in the PR, but let me repeat my opinion here: >>> >>>>> >>> >>>>> IMHO It's extremely unlikely we are going to release and publish >>> the >>> >>>>> common test code / fixtures in any way. They will continue to be in >>> >>>>> development-only-distribution and they will be treated as "internal >>> >>> detail". >>> >>>>> >>> >>>>> If we decide to release and publish them, we will have to maintain >>> >>>>> backwards compatibility and account for our users (like you) using >>> >>> them for >>> >>>>> their own purpose. That would block us or make it very difficult to >>> >>> make >>> >>>>> breaking changes in them. >>> >>>>> >>> >>>>> So while you will be free to continue copying the whole >>> distribution >>> >>> and >>> >>>>> use it in your tests as you want (our licence allows that) - I >>> >>> seriously >>> >>>>> doubt we will ever release and publish it in "reusable form" with >>> >>>>> "compatibility guarantees". It's far more efficient if people like >>> >> you >>> >>> just >>> >>>>> copy them and are aware that they can change any time. >>> >>>>> >>> >>>>> J. >>> >>>>> >>> >>>>> On Sun, Feb 16, 2025 at 2:21 PM Alexander Shorin < >>> kxe...@apache.org> >>> >>>>> wrote: >>> >>>>> >>> >>>>>> I would love to see some airflow_testing package which will be >>> >> useful >>> >>> for >>> >>>>>> testing airflow-related projects and involve independently. >>> >>>>>> >>> >>>>>> Certainly, it's not a good thing to have tests import something >>> from >>> >>>>>> tests. >>> >>>>>> New packages as projects are cheap and provide more flexibility >>> and >>> >>> are >>> >>>>>> useful from outside of the project. >>> >>>>>> >>> >>>>>> Also this project could have a future for testing, compatibility, >>> >>> quality >>> >>>>>> and rest of measuring. >>> >>>>>> >>> >>>>>> -- >>> >>>>>> ,,,^..^,,, >>> >>>>>> >>> >>>>>> >>> >>>>>> On Sun, Feb 16, 2025 at 4:12 PM Jarek Potiuk <ja...@potiuk.com> >>> >>> wrote: >>> >>>>>> >>> >>>>>>> Hello here, >>> >>>>>>> >>> >>>>>>> Next phase of the cleanup - it's been sped up by the comment >>> from >>> >>>>>> @kxepal >>> >>>>>>> - >>> >>> https://github.com/apache/airflow/pull/46801#issuecomment-2661415731 >>> >>>>>> - >>> >>>>>>> I >>> >>>>>>> have planned to do it a bit later this week, but maybe indeed >>> >> it's a >>> >>>>>> good >>> >>>>>>> idea to start a discussion now so that people are not confused. >>> >>>>>>> >>> >>>>>>> Currently we are using "from tests_common" to reuse code in >>> >> various >>> >>>>>>> providers and airflow, and this is fine, with the exception that >>> >>>>>>> "tests_common" is currently just a package in "airflow" main >>> >>> project. >>> >>>>>> But >>> >>>>>>> it does not have to be (or rather it should not be). >>> >>>>>>> >>> >>>>>>> We should have it in a separate sub-project - similarly as >>> >>> providers/* >>> >>>>>> and >>> >>>>>>> task_sdk are now separate projects - and we should add dependency >>> >>> to the >>> >>>>>>> common test code distribution from within all the projects that >>> >> use >>> >>> it >>> >>>>>> (via >>> >>>>>>> workspace feature). >>> >>>>>>> >>> >>>>>>> My proposal is to name it "common_test_code" (but I am open to >>> any >>> >>>>>>> suggestions): It will look like this >>> >>>>>>> >>> >>>>>>> . >>> >>>>>>> |- airflow >>> >>>>>>> |- common_test_code >>> >>>>>>> | pyproject.toml >>> >>>>>>> | src >>> >>>>>>> | tests_common >>> >>>>>>> >>> >>>>>>> The code in airflow and providers will not change, they will >>> >>> continue to >>> >>>>>>> use "from tests_common import ...". >>> >>>>>>> >>> >>>>>>> * uv will work as it used to work, no changes will be needed - uv >>> >>> sync >>> >>>>>> will >>> >>>>>>> automatically install the common_test_code as editable project >>> >>>>>>> >>> >>>>>>> * additionally - (and that's a big plus) after this change you >>> >>> should be >>> >>>>>>> able to run `uv sync` in a provider folder and have >>> >>> provider-specific >>> >>>>>>> separate venv, and basically you should be able to run tests for >>> a >>> >>>>>> provider >>> >>>>>>> using that .venv and generally treat provider project as a >>> >>> "standalone" >>> >>>>>> one >>> >>>>>>> - this is what I hinted at in the previous mail. >>> >>>>>>> >>> >>>>>>> * people who do not use uv but pip for example will have to >>> >>> manually add >>> >>>>>>> the `common_test_code` as an editable install in their venv. For >>> >>>>>> example: >>> >>>>>>> >>> >>>>>>> pip install -e . -e ./task_sdk -e. ./providers/google -e >>> >>>>>> ./common_test_code >>> >>>>>>> >>> >>>>>>> This is the next step in standardizing the layout of the Airflow >>> >>> project >>> >>>>>>> using workspaces. >>> >>>>>>> >>> >>>>>>> J. >>> >>>>>>> >>> >>>>>> >>> >>>>> >>> >>>> >>> >>> >>> >>> --------------------------------------------------------------------- >>> >>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org >>> >>> For additional commands, e-mail: dev-h...@airflow.apache.org >>> >>> >>> >>> >>> >> >>> >>>