It has never been simpler to contribute to Airflow! Awesome job Jarek :)

On 2025/03/21 13:50:05 Jarek Potiuk wrote:
> Quick additional info - if a `tests` or `airflow` folder remains in the
> root of your repo because you had some extra files in it (for example
> generated node_modules) - you should delete those two directories. They
> are now unused, and any files remaining there can and *SHOULD* be deleted.
> 
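Inline note: a minimal sketch of how one might check for those leftover directories before deleting anything. The function name `list_stale_root_dirs` is made up for illustration; the only assumption is the one stated above, that `airflow` and `tests` no longer belong in the repo root.

```shell
# Hypothetical helper (not part of Airflow or breeze): print which of the
# now-unused top-level directories still exist in the given checkout.
list_stale_root_dirs() {
    repo_root="${1:-.}"
    for name in airflow tests; do
        if [ -d "$repo_root/$name" ]; then
            echo "$name"
        fi
    done
}
```

Running `list_stale_root_dirs .` in the root of your checkout prints nothing when you are clean; anything it prints can be inspected and then removed by hand.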
> On Fri, 21 Mar 2025, 14:28 Jarek Potiuk <ja...@potiuk.com> wrote:
> 
> > Ok. Now the "airflow-core" change is merged.
> >
> > Most important - *please rebase all your work now to the latest main*.
> > Most PRs will have conflicts and will need to be rebased anyway, but you
> > will do yourself a favour if you do it manually first.
> >
> > Most likely those rebases will not work from the UI (it will just ask
> > you to do the rebase the manual way and give some hints on how this can
> > be done).
> >
> > If you have the apache airflow repo set as a remote (I have an 'apache'
> > remote), this can usually be done with:
> >
> > git fetch apache
> > git rebase --onto apache/main $(git merge-base apache/main HEAD)
> >
> > Of course you have to check it manually - but this one should take all the
> > commits you locally committed when you worked on your PR and 'transplant'
> > them on top of the main branch.
> >
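Inline note: before rebasing it can help to see exactly which commits would be transplanted. Below is a small sketch under the assumptions of this thread (a remote called `apache` with a `main` branch); the function name `commits_to_transplant` is hypothetical, and it only uses `git merge-base` and `git log`.

```shell
# Sketch: list the commits that would be replayed on top of an upstream ref.
# "apache/main" matches the remote name used in this thread; yours may differ.
commits_to_transplant() {
    upstream="${1:-apache/main}"
    # The fork point: the last commit your branch shares with the upstream ref.
    base="$(git merge-base "$upstream" HEAD)" || return 1
    # Everything on your branch after the fork point, oldest first.
    git log --oneline --reverse "$base..HEAD"
}
```

If the list contains commits that are not yours, the branch was probably based on something other than main, and the rebase deserves a closer manual look.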
> > A few things to take care of afterwards:
> >
> > 1. Make sure to rebuild your breeze image:
> >
> > breeze ci-image build
> >
> > 2. Make sure to resync your uv .venv including reinstallation:
> >
> > uv self update
> > uv sync --reinstall
> >
> > This one will update your venv and make sure it gets reinstalled with the
> > new packages and all necessary deps for core airflow.
> >
> > There are quite a few other variants of such sync you should be able to
> > use from now on:
> >
> > *Syncing airflow core minimum dev dependencies*
> >
> > uv sync
> >
> > This one will (after this change) install airflow core + all optional
> > dependencies of airflow + all pre-installed providers locally (and their
> > dependencies), which means that it should allow you to run all
> > `airflow-core` tests. In theory - we still have a few tests in airflow
> > that might require other providers - to be cleaned up later. I will modify
> > our CI later to also run using those limited, isolated environments to
> > keep it this way in the future.
> >
> > You should also be able to run tests after regular activation of your venv
> > (. ./.venv/bin/activate) and this is where your IDE should also have your
> > Python interpreter set - but uv has this cool `uv run` feature that allows
> > you to run any command with automated activation of the venv:
> >
> > uv run pytest airflow-core/tests/....
> >
> >
> > Also this should work out of the box:
> >
> > uv run airflow
> >
> > Go figure :)
> >
> >
> > *Syncing dependencies for a particular provider (and other dependent providers)*
> >
> > In the root of the Airflow repo:
> >
> > uv sync --package apache-airflow-providers-amazon
> >
> > This will sync amazon and all necessary development deps + all the
> > providers that amazon depends on. This way you **should** be able to run
> > all amazon provider tests (including transfers and all others) - which is
> > what Dennis asked about at the call yesterday.
> >
> > Similarly, you can run your tests this way:
> >
> > uv run --package apache-airflow-providers-amazon pytest providers/amazon/tests/unit/....
> >
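Inline note: the `--package` value follows the provider's directory name. The sketch below derives one from the other; `provider_package_name` is a made-up helper, and the dash-joining of nested provider directories is an assumption extrapolated from the amazon example, not something stated in this thread.

```shell
# Hypothetical helper: derive the distribution name from a provider directory,
# assuming "providers/<path>" maps to "apache-airflow-providers-<path-with-dashes>"
# as it does for amazon above.
provider_package_name() {
    dir="${1#providers/}"   # drop the leading "providers/"
    dir="${dir%/}"          # drop a trailing slash, if any
    echo "apache-airflow-providers-$(printf '%s' "$dir" | tr '/' '-')"
}
```

With that, `uv sync --package "$(provider_package_name providers/amazon)"` run from the repo root would be the same call as the one quoted above.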
> > *Alternative way of syncing provider dependencies*
> >
> > cd providers/amazon
> > uv sync
> >
> > In this case you should be able to also do this:
> >
> > uv run pytest tests/unit/
> >
> > You will soon be able to do the same in `airflow-core` - once the tests
> > that expect providers are removed from "airflow-core".
> >
> > cd airflow-core
> > uv sync
> >
> > That's about it. All the rest should not change, Breeze tests,
> > start-airflow etc. should work as usual.
> >
> >
> > *Syncing all dependencies*
> >
> > This is equivalent to what the `breeze` image has. I do not really
> > recommend using it daily - syncing the venv and swapping dependencies
> > takes less than a second with *uv*; also, you should really treat the
> > .venv in your repo as disposable and something you can easily resync
> > any time.
> >
> > uv sync --all-packages
> >
> > This should allow you to run everything
> >
> > uv run --all-packages pytest ....
> >
> > Have fun!
> >
> > I am here and on slack `#contributors` later today. Send me any
> > questions and problems - I am happy to help (and I encourage you to help
> > each other there too).
> >
> > *Bonus info*
> >
> > Actually you do not even need to do `uv sync`. When you use `uv run`, uv
> > automatically runs `uv sync` under the hood (applying the --package
> > switches as appropriate) and you get the latest env resynced
> > automatically!
> >
> > Actually there is even more - you do not need Python installed at all when
> > you run `uv run` - uv will download and install (in seconds) the right
> > version of Python for you automatically!
> >
> > So really:
> >
> > * Install uv
> > * git clone
> > * uv run pytest
> >
> > Is absolutely all you need to start contributing to Airflow.
> >
> > And I absolutely love it. This has been 4 years in the making and it's
> > finally there!
> >
> > J
> >
> >
> >
> > On Thu, Mar 20, 2025 at 12:56 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> >> Ok. The PR https://github.com/apache/airflow/pull/47798 is "green"
> >> (minus failing main issue with microsoft libraries solved separately and
> >> randomly failing k8s tests that we are fighting with).
> >>
> >> I also added a description of the changes and am happy to take any comments
> >> and reviews. It would be great to get it merged **right** after the beta
> >> release, to not disturb the release but also to get as many open PRs merged
> >> as possible before this merge, to minimize the number of conflicts YOU will
> >> have to solve (at the expense of ME solving them :) ).
> >>
> >> I would like to have a small discussion afterwards on the exact way we
> >> will treat `uv sync` and dependencies - including pre-installed providers -
> >> but I would like to have this discussion later; I do not want to "muddy the
> >> waters" right now. After we merge it and get some teething problems sorted
> >> out, I will start a discussion thread about it. In short, we can still
> >> decide and move around things such as: how many extras are installed by
> >> default with `uv sync`, where we keep the pre-installed providers definition
> >> (in `airflow-core` or `airflow`), and whether we want to keep the `pip`
> >> way of doing things or can rely entirely on `uv` for development (the
> >> latter would simplify some of the hatch_build_* logic and allow us to have
> >> more static dependency definitions).
> >>
> >> But let's leave that discussion for next week. I will set the stage for
> >> it today at the dev call, before I send the email with a more detailed
> >> description of options and dependencies we have - but that should not stop
> >> the "small :) " PR of mine from being merged:
> >>
> >> [image: image.png]
> >>
> >> J.
> >>
> >>
> >>
> >>
> >>
> >> On Wed, Mar 19, 2025 at 9:12 PM Shahar Epstein <sha...@apache.org> wrote:
> >>
> >>> That's hardcore (pun intended) :D
> >>> Great work and good luck merging it!
> >>>
> >>> On Fri, Mar 14, 2025 at 9:28 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>>
> >>> > Hey there,
> >>> >
> >>> > I have a first (very draft, still requiring a number of changes) PR for
> >>> > the final step of the big refactoring of our projects to use a workspace.
> >>> > This is to let you know about the changes coming (so please take a look
> >>> > at the consequences so you are not surprised).
> >>> >
> >>> > This is the most *scary* one -> moving all airflow code to
> >>> > "airflow-core". And I have a draft version of it in
> >>> > https://github.com/apache/airflow/pull/47798
> >>> >
> >>> > And it's not for the faint of heart :)
> >>> >
> >>> > [image: image.png]
> >>> >
> >>> > Note! It's not yet complete and unless you have some general comments,
> >>> > it's likely not worth pointing to individual changes (yet) - it's more
> >>> > to take a look at how things will look eventually. I will work over the
> >>> > next two days to get it to a reviewable state, and will keep it rebased
> >>> > and running till the middle of next week. I would like to have it ready
> >>> > (including the release process) for the fourth (and final?) beta.
> >>> >
> >>> > Some resulting packaging changes:
> >>> >
> >>> > *FOR DEVELOPMENT:*
> >>> >
> >>> > * the pyproject.toml in the "root" of Airflow is still the
> >>> > "apache-airflow" package - but this will be an empty "meta" package that
> >>> > will install "apache-airflow-core" and "apache-airflow-task-sdk" together
> >>> > and, optionally, providers (via extras)
> >>> >
> >>> > * the airflow-core is a new "apache-airflow-core" distribution, where
> >>> > only airflow dependencies and airflow "core" extras are configured (smtp,
> >>> > otel, pandas, rabbitmq etc.) - I will likely clean up some of those as
> >>> > well, as some of them are not needed. The nice thing is that this package
> >>> > has all dependencies static (no hatch_build.py - everything is in
> >>> > pyproject.toml) - which is pretty cool and allows us to better use
> >>> > dependabot for security upgrades and notifications
> >>> >
> >>> > The airflow-core structure is pretty standard:
> >>> >
> >>> > airflow-core  # <- this is the folder where the airflow-core distribution is
> >>> >             \- src
> >>> >             |     \- airflow # <- This is the airflow package
> >>> >             |             \- api
> >>> >             |             |- api_fastapi
> >>> >             |             |- assets
> >>> >             |             ...
> >>> >             |- tests
> >>> >             |       \- always
> >>> >             |       |- api
> >>> >             |       ...
> >>> >             |- docs
> >>> >             |
> >>> >             |- pyproject.toml
> >>> >             |- README.md
> >>> >
> >>> >
> >>> > * for development - I will describe the `pypi` way later, but with `uv`
> >>> > things get simpler and we have a few new options (Dennis - this is a
> >>> > continuation of the discussion on the uv sync commands, so it's worth
> >>> > looking closely):
> >>> >
> >>> > There are a number of ways you will (eventually) be able to interact
> >>> > with the venv after you check out Airflow. You can change the working
> >>> > directory and work on different packages, and depending on which
> >>> > directory you run `uv sync` in, uv (using the workspace feature) will
> >>> > sync the **expected** dependencies.
> >>> >
> >>> > It's best to get used to the fact that instead of one airflow project
> >>> > we will have ~100 pretty independent projects, and while you can continue
> >>> > working with all of them as a single huge "workspace", it is generally
> >>> > way more convenient to change directory to the "distribution" you are
> >>> > currently working on and do everything there - with an isolated set of
> >>> > dependencies required only for that "distribution". "airflow-core",
> >>> > "task-sdk", "providers/amazon", "providers/mongo" - those are all
> >>> > separate distributions, and more and more we will be able to treat them
> >>> > as independent projects (but we will conveniently keep the option to
> >>> > develop and run tests in a joined "workspace" environment at the top of
> >>> > the project where we can install and test everything together - that's a
> >>> > bit of `uv workspace` magic in play).
> >>> >
> >>> > Here are typical patterns:
> >>> >
> >>> > 1) Installing all development dependencies for everything (i.e. a
> >>> > complete environment like in breeze) - allows you to run all tests for
> >>> > all of airflow and all providers
> >>> >
> >>> > cd .
> >>> > uv sync --all-packages
> >>> >
> >>> > 2) Installing just airflow core with required dependencies (ready for
> >>> > most core tests)
> >>> >
> >>> > cd airflow-core
> >>> > uv sync
> >>> >
> >>> > 3) Installing airflow core with optional dependencies (should allow you
> >>> > to run all core tests - including those for optional core features such
> >>> > as otel etc.)
> >>> >
> >>> > cd airflow-core
> >>> > uv sync --all-extras
> >>> >
> >>> > 4) Installing individual provider dependencies (say amazon) - this
> >>> > allows you to run all tests of the provider you are working on -
> >>> > including installing all dependencies from cross-provider dependencies
> >>> > (i.e. if you have google tests in the amazon provider, it will also
> >>> > install the necessary google dependencies).
> >>> >
> >>> > cd providers/amazon
> >>> > uv sync
> >>> >
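Inline note: the four patterns above can be summed up as a small lookup. This is only a cheat-sheet sketch; the function `sync_command_for` and its target names (`all`, `core`, `core-extras`) are invented here, while the printed commands are the ones listed above.

```shell
# Hypothetical cheat-sheet: print the sync command for a given target.
# It only prints the command; it does not run uv.
sync_command_for() {
    case "$1" in
        all)          echo "uv sync --all-packages" ;;
        core)         echo "cd airflow-core && uv sync" ;;
        core-extras)  echo "cd airflow-core && uv sync --all-extras" ;;
        providers/*)  echo "cd $1 && uv sync" ;;
        *)            echo "unknown target: $1" >&2; return 1 ;;
    esac
}
```

For example, `sync_command_for providers/amazon` prints `cd providers/amazon && uv sync`, to be run from the root of the repo.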
> >>> > Generally speaking - "airflow-core" will become (eventually) a truly
> >>> > airflow-only distribution. It will have a few dependencies on the
> >>> > "standard" and "fab" providers - but I hope we will be able to get rid
> >>> > of those during the resulting cleanup.
> >>> >
> >>> > The IDE (IntelliJ) setting will just require "airflow-core/src" and
> >>> > "airflow-core/tests" to be source/test roots as usual for other
> >>> > distributions.
> >>> >
> >>> > I will update the docs after I complete the PR. There are some small
> >>> > variations on when to install which extras, and I will play a bit to
> >>> > get to the best developer experience and fewest surprises.
> >>> >
> >>> > *FOR USERS*
> >>> >
> >>> > For "installable" airflow (i.e. the user's experience) - the changes
> >>> > will be pretty much 100% transparent. When a user installs
> >>> > "apache-airflow" or "apache-airflow[google]" - things will work as they
> >>> > did before - only instead of one "apache-airflow" distribution, they
> >>> > will have "apache-airflow", "apache-airflow-core" and
> >>> > "apache-airflow-task-sdk" installed.
> >>> >
> >>> > Regarding version numbers etc., I will start a separate discussion
> >>> > later next week, after we see how those packages will interact
> >>> > ("apache-airflow" will only contain extras, but for compatibility
> >>> > reasons we likely want to pin both "apache-airflow" and
> >>> > "apache-airflow-core" to each other, so that users will be able to
> >>> > upgrade "core" by upgrading "apache-airflow" - we likely do not want to
> >>> > change those habits).
> >>> >
> >>> > The "apache-airflow-task-sdk" will be versioned separately.
> >>> >
> >>> > Please take a look - also at the PR, see if you have any big
> >>> > issues/questions/doubts - let's start discussion here - I am happy to
> >>> > answer all general questions and adapt the PR to respond to
> >>> > questions/suggestions.
> >>> >
> >>> > In the meantime I will be working on making the PR green and adding
> >>> > missing bits and pieces for the release process.
> >>> >
> >>> > J.
> >>> >
> >>> >
> >>> >
> >>> >
> >>>
> >>
> 

