How about splitting it even more and naming each shared "thing" (for
example "logging" and "config"), then sharing each one explicitly and
separately with the right "user"?
That sounds far more modular, and we will be able to choose which of the
shared "utils" we use where.

J.


On Mon, Jul 7, 2025 at 11:13 PM Jens Scheffler <j_scheff...@gmx.de.invalid>
wrote:

> I like "core_and_task_sdk" just as much as "core-and-task-sdk" - I have
> no problem with either, as it is only a path.
>
> if we get to "dag-parser-scheduler-task-sdk-and-triggerer" which is a
> bit bulky we then should name it "all-not-api-server" :-D
>
> On 07.07.25 22:57, Ash Berlin-Taylor wrote:
> > In case I did a bad job explaining it, the “core and task sdk” is not in
> > the module name/import name, just in the file path.
> >
> > Anyone have other ideas?
> >
> >> On 7 Jul 2025, at 21:37, Buğra Öztürk <ozturkbugr...@gmail.com> wrote:
> >>
> >> Thanks Ash! Looks cool! I like the structure. This will enable all the
> >> combinations, and the structure looks easy to grasp. No strong stance on
> >> the naming, other than that it is maybe a bit long with `and`; `core_ctl`
> >> could be shorter. Since no import path is defined like that, we can give
> >> it any name for sure.
> >>
> >> Best regards,
> >>
> >>> On Mon, 7 Jul 2025, 21:51 Jarek Potiuk, <ja...@potiuk.com> wrote:
> >>>
> >>> Looks good but I think we should find some better logical name for
> >>> core_and_sdk :)
> >>>
> >>> On Mon, 7 Jul 2025, 21:44, Jens Scheffler
> >>> <j_scheff...@gmx.de.invalid> wrote:
> >>>
> >>>> Cool! Especially the "shared" folder with the ability to have
> >>>> N-combinations w/o exploding project repo root!
> >>>>
> >>>> On 07.07.25 14:43, Ash Berlin-Taylor wrote:
> >>>>> Oh, and all of this will be explained in shared/README.md
> >>>>>
> >>>>>> On 7 Jul 2025, at 13:41, Ash Berlin-Taylor <a...@apache.org> wrote:
> >>>>>>
> >>>>>> Okay, so it seems we have agreement on the approach here, so I’ll
> >>>>>> continue with this. On the dev call it was mentioned that
> >>>>>> “airflow-common” wasn’t a great name, so here is my proposal for the
> >>>>>> file structure:
> >>>>>> ```
> >>>>>> /
> >>>>>>   task-sdk/...
> >>>>>>   airflow-core/...
> >>>>>>   shared/
> >>>>>>     kubernetes/
> >>>>>>       pyproject.toml
> >>>>>>       src/
> >>>>>>         airflow_kube/__init__.py
> >>>>>>     core-and-tasksdk/
> >>>>>>       pyproject.toml
> >>>>>>       src/
> >>>>>>         airflow_shared/__init__.py
> >>>>>> ```
> >>>>>>
> >>>>>> Things to note here: the “shared” folder has the possibility of
> >>>>>> having multiple different shared “libraries” in it. In this example
> >>>>>> I am supposing a hypothetical shared kubernetes folder, in a world in
> >>>>>> which we split the KubePodOperator and the KubeExecutor into two
> >>>>>> separate distributions (example only, not proposing we do that right
> >>>>>> now, and that will be a separate discussion).
> >>>>>> The other things to note here:
> >>>>>>
> >>>>>>
> >>>>>> - the folder name in shared aims to be “self-documenting”, hence the
> >>>>>>   verbose “core-and-tasksdk” to say where the shared library is
> >>>>>>   intended to be used.
> >>>>>> - the Python module itself should almost always have an `airflow_`
> >>>>>>   (or maybe `_airflow_`?) prefix so that it does not conflict with
> >>>>>>   anything else we might use. It won’t matter “in production”, as
> >>>>>>   those will be vendored in to be imported as
> >>>>>>   `airflow/_vendor/airflow_shared` etc., but avoiding conflicts at
> >>>>>>   dev time with the Finder approach is a good safety measure.
> >>>>>>
> >>>>>> I will start making a real PR for this proposal now, but I’m open to
> >>>>>> feedback (either here, or in the PR when I open it)
> >>>>>> -ash
> >>>>>>
> >>>>>>> On 4 Jul 2025, at 16:55, Jarek Potiuk <ja...@potiuk.com> wrote:
> >>>>>>>
> >>>>>>> Yeah, we have to try it and test. Also, building packages happens
> >>>>>>> semi-frequently when you run `uv sync` (it uses some kind of
> >>>>>>> heuristics to decide when) and you can force it with `--reinstall`
> >>>>>>> or `--refresh`. A package build also happens every time you run
> >>>>>>> `ci-image build` in breeze now, so it seems like it will integrate
> >>>>>>> nicely into our workflows.
> >>>>>>>
> >>>>>>> Looks really cool Ash.
> >>>>>>>
> >>>>>>> On Fri, Jul 4, 2025 at 5:14 PM Ash Berlin-Taylor <a...@apache.org>
> >>>> wrote:
> >>>>>>>> It’s not just release time, but any time we build a package which
> >>>> happens
> >>>>>>>> on “every” CI run. The normal unit tests will use code from
> >>>>>>>> airflow-common/src/airflow_common; the kube tests which build an
> >>>> image will
> >>>>>>>> build the dists and vendor in the code from that commit.
> >>>>>>>>
> >>>>>>>> There is only a single copy of the shared code committed to the
> >>> repo,
> >>>> so
> >>>>>>>> there is never anything to synchronise.
> >>>>>>>>
> >>>>>>>>> On 4 Jul 2025, at 15:53, Amogh Desai <amoghdesai....@gmail.com>
> >>>> wrote:
> >>>>>>>>> Thanks Ash.
> >>>>>>>>>
> >>>>>>>>> This is really cool and helpful that you were able to test both
> >>>> scenarios
> >>>>>>>>> -- repo checkout
> >>>>>>>>> and also installing from the vendored package and the resolution
> >>>> worked
> >>>>>>>>> fine too.
> >>>>>>>>>
> >>>>>>>>> I like this idea compared to the relative import one for a few
> >>>>>>>>> reasons:
> >>>>>>>>> - It feels like it will take some time to adjust to the new coding
> >>>>>>>>>   standard that we would lay down if we impose relative imports in
> >>>>>>>>>   the shared dist
> >>>>>>>>> - We can continue using repo-wide absolute import standards. It is
> >>>>>>>>>   also much easier for situations when we do a global search in the
> >>>>>>>>>   IDE to find + replace; this could mean a change there
> >>>>>>>>> - Vendoring is a proven and established paradigm across projects
> >>>>>>>>>   and would also give us the build tooling we need out of the box
> >>>>>>>>>
> >>>>>>>>> Nothing too against the relative import but with the evidence
> >>>> provided
> >>>>>>>>> above, vendored approach
> >>>>>>>>> seems to only do us good.
> >>>>>>>>>
> >>>>>>>>> Regarding synchronizing it, release time should be fine as long
> as
> >>> we
> >>>>>>>> have
> >>>>>>>>> a good CI workflow to probably
> >>>>>>>>> catch such issues per PR if changes are made in shared dist?
> >>>> (precommit
> >>>>>>>>> would make it really slow i guess)
> >>>>>>>>>
> >>>>>>>>> If we can run our tests with vendored code we should be mostly
> >>>> covered.
> >>>>>>>>> Good effort all!
> >>>>>>>>>
> >>>>>>>>> Thanks & Regards,
> >>>>>>>>> Amogh Desai
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> On Fri, Jul 4, 2025 at 7:23 PM Ash Berlin-Taylor <
> a...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>>> Okay, I think I’ve got something that works and I’m happy with.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>
> https://github.com/astronomer/airflow/tree/shared-vendored-lib-tasksdk-and-core
> >>>>>>>>>> This produces the following from `uv build task-sdk`
> >>>>>>>>>> -
> >>>>>>>>>>
> >>>
> https://github.com/user-attachments/files/21058976/apache_airflow_task_sdk-1.1.0.tar.gz
> >>>>>>>>>> -
> >>>>>>>>>>
> >>>
> https://github.com/user-attachments/files/21058996/apache_airflow_task_sdk-1.1.0-py3-none-any.whl.zip
> >>>>>>>>>> (`.whl.zip` as GH won't allow .whl upload, but will .zip)
> >>>>>>>>>>
> >>>>>>>>>> ```
> >>>>>>>>>> ❯ unzip -l
> >>> dist/apache_airflow_task_sdk-1.1.0-py3-none-any.whl.zip |
> >>>>>>>> grep
> >>>>>>>>>> _vendor
> >>>>>>>>>>      50  02-02-2020 00:00   airflow/sdk/_vendor/.gitignore
> >>>>>>>>>>    2082  02-02-2020 00:00   airflow/sdk/_vendor/__init__.py
> >>>>>>>>>>      28  02-02-2020 00:00
>  airflow/sdk/_vendor/airflow_common.pyi
> >>>>>>>>>>      18  02-02-2020 00:00   airflow/sdk/_vendor/vendor.txt
> >>>>>>>>>>     785  02-02-2020 00:00
> >>>>>>>>>> airflow/sdk/_vendor/airflow_common/__init__.py
> >>>>>>>>>>   10628  02-02-2020 00:00
> >>>>>>>>>> airflow/sdk/_vendor/airflow_common/timezone.py
> >>>>>>>>>> ```
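
The same check as the `unzip -l ... | grep _vendor` listing above can be scripted. This is only a minimal sketch using the standard-library `zipfile` module; the function name `vendored_members` and the `/_vendor/` marker are illustrative assumptions, not part of the proposal.

```python
import zipfile
from pathlib import Path


def vendored_members(wheel_path: Path, marker: str = "/_vendor/") -> list[str]:
    """Return the archive member names that live under a ``_vendor`` directory.

    ``vendored_members`` is a hypothetical helper for illustration only.
    """
    # A wheel is a plain zip archive, so zipfile can list its contents.
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(name for name in wheel.namelist() if marker in name)
```

A CI step could call this on the built wheel and fail if the list is empty, which would catch a build that silently skipped the vendoring step.
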
> >>>>>>>>>>
> >>>>>>>>>> And similarly in the .tar.gz, so our “sdist” is complete too:
> >>>>>>>>>> ```
> >>>>>>>>>> ❯ tar -tzf dist/apache_airflow_task_sdk-1.1.0.tar.gz |grep
> _vendor
> >>>>>>>>>> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/.gitignore
> >>>>>>>>>>
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/__init__.py
> >>>>>>>>>>
> >>>>
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common.pyi
> >>>>>>>>>> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/vendor.txt
> >>>>>>>>>>
> >>>>>>>>>>
> >>>
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common/__init__.py
> >>>
> apache_airflow_task_sdk-1.1.0/src/airflow/sdk/_vendor/airflow_common/timezone.py
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>> The plugin works at build time by including/copying the libs
> >>>> specified
> >>>>>>>> in
> >>>>>>>>>> vendor.txt into place (and let `vendoring` take care of import
> >>>>>>>> rewrites.)
> >>>>>>>>>> For the imports to continue to work at “dev” time/from a repo
> >>>> checkout,
> >>>>>>>> I
> >>>>>>>>>> have added a import finder to `sys.meta_path`, and since it’s at
> >>> the
> >>>>>>>> end of
> >>>>>>>>>> the list it will only be used if the normal import can’t find
> >>>> things.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>
> https://github.com/astronomer/airflow/blob/996817782be6071b306a87af9f36fe1cf2d3aaa3/task-sdk/src/airflow/sdk/_vendor/__init__.py
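
To illustrate the fallback-finder idea described above, here is a rough sketch and not the code from the linked commit. It assumes a hypothetical `SharedSourceFinder` class that maps imports under a vendor prefix (for example `airflow.sdk._vendor`) onto a source directory in the checkout. Because it is appended to the end of `sys.meta_path`, it is consulted only after the normal finders have failed.

```python
import importlib.abc
import importlib.util
import sys
from pathlib import Path


class SharedSourceFinder(importlib.abc.MetaPathFinder):
    """Resolve ``<vendor_prefix>.<lib>`` from a source checkout (hypothetical helper)."""

    def __init__(self, vendor_prefix: str, source_root: Path) -> None:
        self.vendor_prefix = vendor_prefix + "."
        self.source_root = source_root

    def find_spec(self, fullname, path=None, target=None):
        if not fullname.startswith(self.vendor_prefix):
            return None  # not a vendored name; leave it to the other finders
        parts = fullname[len(self.vendor_prefix):].split(".")
        candidate = self.source_root.joinpath(*parts)
        init_file = candidate / "__init__.py"
        if init_file.is_file():
            # A package: expose its directory so submodules can be found too.
            return importlib.util.spec_from_file_location(
                fullname, init_file, submodule_search_locations=[str(candidate)]
            )
        module_file = candidate.with_suffix(".py")
        if module_file.is_file():
            return importlib.util.spec_from_file_location(fullname, module_file)
        return None


def install(vendor_prefix: str, source_root: Path) -> SharedSourceFinder:
    """Append the finder so it only acts as a fallback after normal imports."""
    finder = SharedSourceFinder(vendor_prefix, source_root)
    sys.meta_path.append(finder)
    return finder
```

In an installed distribution the vendored files exist on disk, so the regular path-based finder resolves them first and this fallback is never reached.
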
> >>>>>>>>>> This doesn’t quite give us the same runtime effect as “import
> >>>>>>>>>> rewriting”, as in this approach `airflow_common` is directly loaded
> >>>>>>>>>> (i.e. both airflow.sdk._vendor.airflow_common and airflow_common
> >>>>>>>>>> exist in sys.modules), but it does work for everything that I was
> >>>>>>>>>> able to test.
> >>>>>>>>>>
> >>>>>>>>>> I tested it with the diff at the end of this message. My test
> >>>>>>>>>> ipython shell:
> >>>>>>>>>>
> >>>>>>>>>> ```
> >>>>>>>>>> In [1]: from airflow.sdk._vendor.airflow_common.timezone import
> >>> foo
> >>>>>>>>>> In [2]: foo
> >>>>>>>>>> Out[2]: 1
> >>>>>>>>>>
> >>>>>>>>>> In [3]: import airflow.sdk._vendor.airflow_common
> >>>>>>>>>>
> >>>>>>>>>> In [4]: import airflow.sdk._vendor.airflow_common.timezone
> >>>>>>>>>>
> >>>>>>>>>> In [5]: airflow.sdk._vendor.airflow_common.__file__
> >>>>>>>>>> Out[5]:
> >>>>>>>>>>
> >>>
> '/Users/ash/code/airflow/airflow/airflow-common/src/airflow_common/__init__.py'
> >>>>>>>>>> In [6]: airflow.sdk._vendor.airflow_common.timezone.__file__
> >>>>>>>>>> Out[6]:
> >>>>>>>>>>
> >>>
> '/Users/ash/code/airflow/airflow/airflow-common/src/airflow_common/timezone.py'
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> And in a standalone environment with the SDK dist I built (it
> >>>>>>>>>> needed the matching airflow-core right now, but that has nothing to
> >>>>>>>>>> do with this discussion):
> >>>>>>>>>>
> >>>>>>>>>> ```
> >>>>>>>>>> ❯ _AIRFLOW__AS_LIBRARY=1 uvx --python 3.12 --with
> >>>>>>>>>> dist/apache_airflow_core-3.1.0-py3-none-any.whl --with
> >>>>>>>>>> dist/apache_airflow_task_sdk-1.1.0-py3-none-any.whl ipython
> >>>>>>>>>> Python 3.12.7 (main, Oct 16 2024, 07:12:08) [Clang 18.1.8 ]
> >>>>>>>>>> Type 'copyright', 'credits' or 'license' for more information
> >>>>>>>>>> IPython 9.4.0 -- An enhanced Interactive Python. Type '?' for
> >>> help.
> >>>>>>>>>> Tip: You can use `%hist` to view history, see the options with
> >>>>>>>> `%history?`
> >>>>>>>>>> In [1]: import airflow.sdk._vendor.airflow_common.timezone
> >>>>>>>>>>
> >>>>>>>>>> In [2]: airflow.sdk._vendor.airflow_common.timezone.__file__
> >>>>>>>>>> Out[2]:
> >>>>>>>>>> '/Users/ash/.cache/uv/archive-v0/WWq6r65aPto2eJOyPObEH/lib/python3.12/site-packages/airflow/sdk/_vendor/airflow_common/timezone.py'
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ```diff
> >>>>>>>>>> diff --git a/airflow-common/src/airflow_common/__init__.py
> >>>>>>>>>> b/airflow-common/src/airflow_common/__init__.py
> >>>>>>>>>> index 13a83393a9..927b7c6b61 100644
> >>>>>>>>>> --- a/airflow-common/src/airflow_common/__init__.py
> >>>>>>>>>> +++ b/airflow-common/src/airflow_common/__init__.py
> >>>>>>>>>> @@ -14,3 +14,5 @@
> >>>>>>>>>> # KIND, either express or implied.  See the License for the
> >>>>>>>>>> # specific language governing permissions and limitations
> >>>>>>>>>> # under the License.
> >>>>>>>>>> +
> >>>>>>>>>> +foo = 1
> >>>>>>>>>> diff --git a/airflow-common/src/airflow_common/timezone.py
> >>>>>>>>>> b/airflow-common/src/airflow_common/timezone.py
> >>>>>>>>>> index 340b924c66..58384ef20f 100644
> >>>>>>>>>> --- a/airflow-common/src/airflow_common/timezone.py
> >>>>>>>>>> +++ b/airflow-common/src/airflow_common/timezone.py
> >>>>>>>>>> @@ -36,6 +36,9 @@ _PENDULUM3 =
> >>>>>>>>>> version.parse(metadata.version("pendulum")).major == 3
> >>>>>>>>>> # - FixedTimezone(0, "UTC") in pendulum 2
> >>>>>>>>>> utc = pendulum.UTC
> >>>>>>>>>>
> >>>>>>>>>> +
> >>>>>>>>>> +from airflow_common import foo
> >>>>>>>>>> +
> >>>>>>>>>> TIMEZONE: Timezone
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ```
> >>>>>>>>>>
> >>>>>>>>>>>> On 3 Jul 2025, at 12:43, Jarek Potiuk <ja...@potiuk.com>
> wrote:
> >>>>>>>>>>> I think both approaches are doable:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) -> We can very easily prevent bad imports from other
> >>>>>>>>>>> distributions with pre-commit, and make sure we only use relative
> >>>>>>>>>>> imports in the shared modules. We are doing plenty of this
> >>>>>>>>>>> already. And yes, it would require relative imports, which we
> >>>>>>>>>>> currently do not allow.
> >>>>>>>>>>>
> >>>>>>>>>>> 2) -> has one disadvantage: someone at some point in time will
> >>>>>>>>>>> have to decide to synchronize this, and if it happens just before
> >>>>>>>>>>> a release (I bet this is going to happen) it will lead to solving
> >>>>>>>>>>> problems that would normally be solved during the PR in which the
> >>>>>>>>>>> change is made (i.e. a symbolic link has the advantage that whoever
> >>>>>>>>>>> modifies shared code will be immediately notified in their PR that
> >>>>>>>>>>> they broke something, because static checks, mypy or tests fail).
> >>>>>>>>>>>
> >>>>>>>>>>> Ash, do you have an idea of a process (who and when) for doing the
> >>>>>>>>>>> synchronisation in the case of vendoring? Maybe we could solve it
> >>>>>>>>>>> if it is done more frequently and with some regularity? We could
> >>>>>>>>>>> potentially force re-vendoring at PR time as well, any time shared
> >>>>>>>>>>> code changes (and guard it with pre-commit). I can't think of any
> >>>>>>>>>>> other place (other than releases) in our development workflow, and
> >>>>>>>>>>> that seems to be a bit too late, as it puts the extra effort of
> >>>>>>>>>>> fixing potential incompatibilities on the release manager and
> >>>>>>>>>>> delays the release. WDYT?
> >>>>>>>>>>>
> >>>>>>>>>>> Re: relative imports. I think for a shared library we could
> >>>>>>>>>>> potentially relax this and allow them (and actually disallow
> >>>>>>>>>>> absolute imports in the pieces of code that are shared - again, by
> >>>>>>>>>>> pre-commit). As I recall, the only reason we forbade relative
> >>>>>>>>>>> imports is because of how we are (or maybe were) doing DAG parsing
> >>>>>>>>>>> and the failures resulting from it. So we decided to just not allow
> >>>>>>>>>>> them, to keep consistency. The way Dag parsing works is that when
> >>>>>>>>>>> you use importlib to read the Dag from a file, relative imports do
> >>>>>>>>>>> not work, as it does not know what they should be relative to. But
> >>>>>>>>>>> if a relative import is done from an imported package, it should be
> >>>>>>>>>>> no problem, I think - otherwise our Dags would not be able to
> >>>>>>>>>>> import any library that uses relative imports.
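
The importlib behaviour described above can be reproduced in isolation. The sketch below is a simplified illustration and not Airflow's actual Dag loader; the helper names `load_file_as_module` and `relative_import_error` are made up for this example. It loads a single file as a top-level module, so a relative import inside it has no parent package to resolve against.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Optional


def load_file_as_module(name: str, path: Path):
    """Execute one file as a top-level module, i.e. with no parent package."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def relative_import_error(source: str) -> Optional[str]:
    """Return the ImportError message raised while loading ``source``, or None."""
    with tempfile.TemporaryDirectory() as tmp:
        dag_file = Path(tmp) / "example_dag.py"
        dag_file.write_text(source)
        # A sibling file exists, but the loaded file is not part of a package.
        (Path(tmp) / "helpers.py").write_text("VALUE = 1\n")
        try:
            load_file_as_module("example_dag", dag_file)
        except ImportError as exc:
            return str(exc)
    return None
```

Calling `relative_import_error("from . import helpers\n")` returns an error message, while a file without relative imports loads cleanly. A relative import inside a library that was itself imported as a package is unaffected, since its parent package is known.
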
> >>>>>>>>>>>
> >>>>>>>>>>> Of course consistency might be the reason why we do not want to
> >>>>>>>> introduce
> >>>>>>>>>>> relative imports. I don't see it as an issue if it is guarded
> by
> >>>>>>>>>> pre-commit
> >>>>>>>>>>> though.
> >>>>>>>>>>>
> >>>>>>>>>>> J.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> J.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, 3 Jul 2025, 12:11, Ash Berlin-Taylor <a...@apache.org>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Oh yes, symlinks will work, with one big caveat: it does mean you
> >>>>>>>>>>>> can’t use absolute imports from one common module to another.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For example
> >>>>>>>>>>>>
> >>>
> https://github.com/apache/airflow/blob/4c66ebd06/airflow-core/src/airflow/utils/serve_logs.py#L41
> >>>>>>>>>>>> where we have
> >>>>>>>>>>>>
> >>>>>>>>>>>> ```
> >>>>>>>>>>>> from airflow.utils.module_loading import import_string
> >>>>>>>>>>>> ```
> >>>>>>>>>>>>
> >>>>>>>>>>>> If we want to move serve_logs into this common lib that is then
> >>>>>>>>>>>> symlinked, we wouldn’t be able to have `from
> >>>>>>>>>>>> airflow_common.module_loading import import_string`.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I can think of two possible solutions here.
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1) is to allow/require relative imports in this shared lib, so
> >>>>>>>>>>>> `from .module_loading import import_string`
> >>>>>>>>>>>> 2) is to use `vendoring`[1] (from the pip maintainers) which will
> >>>>>>>>>>>> handle import-rewriting for us.
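
To make "import rewriting" in option 2 concrete, here is a deliberately simplified sketch of the idea. It is not how the `vendoring` tool is implemented; the function `rewrite_imports` and its regex-based approach are assumptions for illustration only. It prefixes absolute imports of a shared library with the vendor namespace.

```python
import re


def rewrite_imports(source: str, lib: str, vendor_prefix: str) -> str:
    """Prefix absolute imports of ``lib`` with ``vendor_prefix`` (illustrative only)."""
    name = re.escape(lib)
    # "from airflow_common.x import y" -> "from <prefix>.airflow_common.x import y"
    source = re.sub(
        rf"^([ \t]*from[ \t]+)({name})(\.[\w.]+)?([ \t]+import[ \t])",
        rf"\g<1>{vendor_prefix}.\g<2>\g<3>\g<4>",
        source,
        flags=re.MULTILINE,
    )
    # "import airflow_common" -> "from <prefix> import airflow_common"
    source = re.sub(
        rf"^([ \t]*)import[ \t]+({name})[ \t]*$",
        rf"\g<1>from {vendor_prefix} import \g<2>",
        source,
        flags=re.MULTILINE,
    )
    return source
```

For example, `rewrite_imports(src, "airflow_common", "airflow.sdk._vendor")` turns `from airflow_common.module_loading import import_string` into `from airflow.sdk._vendor.airflow_common.module_loading import import_string`, so the shared code can keep absolute imports in the repo while the shipped copy refers only to its own vendored namespace.
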
> >>>>>>>>>>>>
> >>>>>>>>>>>> I’d entirely forgotten that symlinks in repos were a thing, so I
> >>>>>>>>>>>> prepared a minimal POC/demo of what the vendoring approach could
> >>>>>>>>>>>> look like here:
> >>>>>>>>>>>> https://github.com/apache/airflow/commit/996817782be6071b306a87af9f36fe1cf2d3aaa3
> >>>>>>>>>>>>
> >>>>>>>>>>>> Now personally I am more than happy with relative imports, but
> >>>>>>>>>>>> generally as a project we have avoided them, so I think that limits
> >>>>>>>>>>>> what we could do with a symlink-based approach.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -ash
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1] https://github.com/pradyunsg/vendoring
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On 3 Jul 2025, at 10:30, Pavankumar Gopidesu <
> >>>>>>>> gopidesupa...@gmail.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>> Thanks Ash
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Yes, agreed, option 2 would be my preference, making sure we have
> >>>>>>>>>>>>> all the guardrails to protect against any unwanted behaviour in
> >>>>>>>>>>>>> code sharing and to execute the right set of tests between the
> >>>>>>>>>>>>> packages.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Agree with others, option 2 would be
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Jul 3, 2025 at 10:02 AM Amogh Desai <
> >>>>>>>> amoghdesai....@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for starting this discussion, Ash.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I would prefer option 2 here with proper tooling to handle
> the
> >>>> code
> >>>>>>>>>>>>>> duplication at *release* time.
> >>>>>>>>>>>>>> It is best to have a dist that has all it needs in itself.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Option 1 could very quickly get out of hand and if we decide
> >>> to
> >>>>>>>>>> separate
> >>>>>>>>>>>>>> triggerer /
> >>>>>>>>>>>>>> dag processor / config etc etc as separate packages, back
> >>>> compat is
> >>>>>>>>>>>> going
> >>>>>>>>>>>>>> to be a nightmare
> >>>>>>>>>>>>>> and will bite us harder than we anticipate.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks & Regards,
> >>>>>>>>>>>>>> Amogh Desai
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Jul 3, 2025 at 1:12 AM Kaxil Naik <
> >>> kaxiln...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>> I prefer Option 2 as well to avoid matrix of dependencies
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Thu, 3 Jul 2025 at 01:03, Jens Scheffler
> >>>>>>>>>> <j_scheff...@gmx.de.invalid
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'd also prefer option 2 - the reason is that it is rather
> >>>>>>>>>>>>>>>> pragmatic: we do not need to cut another package, and we have
> >>>>>>>>>>>>>>>> fewer packages and dependencies.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I remember some time ago I was checking (together with
> >>> Jarek,
> >>>> I am
> >>>>>>>>>> not
> >>>>>>>>>>>>>>>> sure anymore...) if the usage of symlinks would be
> possible.
> >>>> To
> >>>>>>>> keep
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> source in one package but "symlink" it into another. If
> then
> >>>> at
> >>>>>>>>>> point
> >>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>> packaging/release the files are materialized we have 1 set
> >>> of
> >>>>>>>> code.
> >>>>>>>>>>>>>>>> Otherwise if not possible still the redundancy could be
> >>>> solved by
> >>>>>>>> a
> >>>>>>>>>>>>>>>> pre-commit hook - and in Git the files are de-duplicated
> >>>> anyway
> >>>>>>>>>> based
> >>>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>> content hash, so this does not hurt.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 02.07.25 18:49, Shahar Epstein wrote:
> >>>>>>>>>>>>>>>>> I support option 2 with proper automation & CI - the
> >>>> reasonings
> >>>>>>>>>>>>>> you've
> >>>>>>>>>>>>>>>>> shown for that make sense to me.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Shahar
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Wed, Jul 2, 2025 at 3:36 PM Ash Berlin-Taylor <
> >>>> a...@apache.org
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Hello everyone,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> As we work on finishing off the code-level separation of Task SDK
> >>>>>>>>>>>>>>>>>> and Core (scheduler etc.) we have come across some situations
> >>>>>>>>>>>>>>>>>> where we would like to share code between these.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> However, it’s not as straightforward as “just put it in a common
> >>>>>>>>>>>>>>>>>> dist they both depend upon”, because one of the goals of the Task
> >>>>>>>>>>>>>>>>>> SDK separation was to have 100% complete version independence
> >>>>>>>>>>>>>>>>>> between the two, ideally even if they are built into the same
> >>>>>>>>>>>>>>>>>> image and venv. Most of the reason why this isn’t straightforward
> >>>>>>>>>>>>>>>>>> comes down to backwards compatibility - if we make a change to
> >>>>>>>>>>>>>>>>>> the common/shared distribution
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> We’ve listed the options we have thought about in
> >>>>>>>>>>>>>>>>>> https://github.com/apache/airflow/issues/51545 (but
> that
> >>>> covers
> >>>>>>>>>>>>>> some
> >>>>>>>>>>>>>>>> more
> >>>>>>>>>>>>>>>>>> things that I don’t want to get in to in this discussion
> >>>> such as
> >>>>>>>>>>>>>>>> possibly
> >>>>>>>>>>>>>>>>>> separating operators and executors out of a single
> >>> provider
> >>>>>>>> dist.)
> >>>>>>>>>>>>>>>>>> To give a concrete example of some code I would like to share,
> >>>>>>>>>>>>>>>>>> the logging config:
> >>>>>>>>>>>>>>>>>> https://github.com/apache/airflow/blob/84897570bf7e438afb157ba4700768ea74824295/airflow-core/src/airflow/_logging/structlog.py
> >>>>>>>>>>>>>>>>>> Another thing we will want to share will be the
> >>>>>>>>>>>>>>>>>> AirflowConfigParser class from airflow.configuration (but notably:
> >>>>>>>>>>>>>>>>>> only the parser class, _not_ the default config values; again,
> >>>>>>>>>>>>>>>>>> let’s not dwell on the specifics of that).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> So to bring the options listed in the issue here for
> >>>> discussion,
> >>>>>>>>>>>>>>> broadly
> >>>>>>>>>>>>>>>>>> speaking there are two high-level approaches:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1. A single shared distribution
> >>>>>>>>>>>>>>>>>> 2. No shared package and copy/duplicate code
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The advantage of Approach 1 is that we only have the code in one
> >>>>>>>>>>>>>>>>>> place. However, for me, at least in this specific case of the
> >>>>>>>>>>>>>>>>>> logging config or the AirflowConfigParser class, the problem is
> >>>>>>>>>>>>>>>>>> that backwards compatibility is much, much harder.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The main advantage of Approach 2 is that the code is released
> >>>>>>>>>>>>>>>>>> with/embedded in the dist (i.e. apache-airflow-task-sdk would
> >>>>>>>>>>>>>>>>>> contain the right version of the logging config and ConfigParser
> >>>>>>>>>>>>>>>>>> etc.). The downside is that either the code will need to be
> >>>>>>>>>>>>>>>>>> duplicated in the repo, or better yet it would live in a single
> >>>>>>>>>>>>>>>>>> place in the repo, but some tooling (TBD) will automatically
> >>>>>>>>>>>>>>>>>> handle the duplication, either at commit time or, my preference,
> >>>>>>>>>>>>>>>>>> at release time.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> For this kind of shared “utility” code I am very strongly leaning
> >>>>>>>>>>>>>>>>>> towards option 2 with automation, as otherwise I think the
> >>>>>>>>>>>>>>>>>> backwards compatibility requirements would make it unworkable
> >>>>>>>>>>>>>>>>>> (very quickly over time the combinations we would have to test
> >>>>>>>>>>>>>>>>>> would just be unreasonable) and I don’t feel confident we can
> >>>>>>>>>>>>>>>>>> have things as stable as we need to really deliver the version
> >>>>>>>>>>>>>>>>>> separation/independence I want to deliver with AIP-72.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> So unless someone feels very strongly about this, I will come up
> >>>>>>>>>>>>>>>>>> with a draft PR for further discussion that will implement code
> >>>>>>>>>>>>>>>>>> sharing via “vendoring” it at build time. I have an idea of how I
> >>>>>>>>>>>>>>>>>> can achieve this so we have a single version in the repo and
> >>>>>>>>>>>>>>>>>> it’ll work there, but at runtime we vendor it into the shipped
> >>>>>>>>>>>>>>>>>> dist so it lives at something like `airflow.sdk._vendor` etc.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> In terms of repo layout, this likely means we would end up with:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> airflow-core/pyproject.toml
> >>>>>>>>>>>>>>>>>> airflow-core/src/
> >>>>>>>>>>>>>>>>>> airflow-core/tests/
> >>>>>>>>>>>>>>>>>> task-sdk/pyproject.toml
> >>>>>>>>>>>>>>>>>> task-sdk/src/
> >>>>>>>>>>>>>>>>>> task-sdk/tests/
> >>>>>>>>>>>>>>>>>> airflow-common/src/
> >>>>>>>>>>>>>>>>>> airflow-common/tests/
> >>>>>>>>>>>>>>>>>> # Possibly no airflow-common/pyproject.toml, as deps would be
> >>>>>>>>>>>>>>>>>> # included in the downstream projects. TBD.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thoughts and feedback welcomed.
> >>>> ---------------------------------------------------------------------
> >>>>>>>>>>>>>>>> To unsubscribe, e-mail:
> dev-unsubscr...@airflow.apache.org
> >>>>>>>>>>>>>>>> For additional commands, e-mail:
> >>> dev-h...@airflow.apache.org
