@Uranusjr would this help as a pilot in your AIP-60 code to parse and validate 
URIs for datasets?

Mit freundlichen Grüßen / Best regards

Jens Scheffler

Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY | 
www.bosch.com
Tel. +49 711 811-91508 | Mobil +49 160 90417410 | jens.scheff...@de.bosch.com

Sitz: Stuttgart, Registergericht: Amtsgericht Stuttgart, HRB 14000;
Aufsichtsratsvorsitzender: Prof. Dr. Stefan Asenkerschbaumer; 
Geschäftsführung: Dr. Stefan Hartung, Dr. Christian Fischer, Dr. Markus 
Forschner, 
Stefan Grosch, Dr. Markus Heyn, Dr. Frank Meyer, Dr. Tanja Rückert

-----Original Message-----
From: Jarek Potiuk <ja...@potiuk.com> 
Sent: Thursday, 22 February 2024 00:53
To: dev@airflow.apache.org
Subject: Re: [DISCUSS] Common.util provider?

Yep. It could work with symbolic links. I tested it with flit: in both the 
wheel and the sdist, a symbolically linked file is dereferenced and a copy of 
the file is packaged. It could be a nice way of doing it.
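
A minimal sketch of how such a check could be repeated on built artifacts, assuming the sdist is a .tar.gz and the wheel is a regular zip file. The function names are hypothetical and only the Python standard library is used; this is not flit's own mechanism, just a way to inspect its output:

```python
import stat
import tarfile
import zipfile


def symlinks_in_sdist(path):
    """Return the names of symlink members in a .tar.gz sdist."""
    with tarfile.open(path, "r:gz") as archive:
        return [member.name for member in archive.getmembers() if member.issym()]


def symlinks_in_wheel(path):
    """Return the names of symlink entries in a wheel (a zip archive)."""
    with zipfile.ZipFile(path) as archive:
        return [
            info.filename
            for info in archive.infolist()
            # The upper 16 bits of external_attr hold the Unix file mode.
            if stat.S_ISLNK(info.external_attr >> 16)
        ]
```

If both functions return an empty list for a provider's artifacts, the linked files were packaged as plain copies.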

Maybe then worth trying next time if someone has a need?

J

On Thu, Feb 22, 2024 at 12:39 AM Scheffler Jens (XC-AS/EAE-ADA-T) 
<jens.scheff...@de.bosch.com.invalid> wrote:

> >>> As for the additional dependency complexity between providers, I
> >>> think the additional dependency actually creates more problems than
> >>> benefits… It would be cool if there were an option to "inline" common
> >>> code from a single place but keep individual providers fully
> >>> independent…
>
> >Well, we already do a lot of inlining, so if we think we should do
> >more, we have mechanisms for that. We have pre-commits and release
> >commands that do a lot of that. Pre-commits inline scripts in
> >Dockerfiles and shorten the PyPI readme. The providers' __init__.py
> >files, changelogs and (partially) the index documentation .rst are
> >generated at release documentation preparation time, pyproject.toml
> >files for providers are generated from common templates at package
> >building time, and so on and so on :). So we can do more of that and
> >generate common code; it's just a matter of adding pre-commits or
> >breeze scripts. But again, "you can't have your cake and eat it too":
> >this has the drawback that there are extra steps involved, and even if
> >it's automated it adds friction when you have to regenerate the code
> >every time you change it, and when you change it in a different place
> >than where you use it.
>
> Yes, I also thought about pre-commit for a moment. I'd be okay if we
> really inline and have a pre-commit that keeps the redundant Python
> copies in the folders aligned. It might need to be opt-in: if only 10 of
> 85 providers use the common stuff, a change to a common line probably
> does not need to affect all providers.
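
The opt-in alignment described above could be sketched roughly as follows. This assumes one canonical source file and an explicit list of provider copies that opted in; the function names and the file layout are hypothetical, and this is not an existing Airflow pre-commit hook:

```python
from pathlib import Path


def out_of_sync_copies(source, copies):
    """Return the opted-in copies that are missing or differ from the source."""
    expected = Path(source).read_bytes()
    return [
        copy
        for copy in map(Path, copies)
        if not copy.exists() or copy.read_bytes() != expected
    ]


def sync_copies(source, copies):
    """Overwrite stale copies with the source and return the ones changed."""
    stale = out_of_sync_copies(source, copies)
    for copy in stale:
        copy.parent.mkdir(parents=True, exist_ok=True)
        copy.write_bytes(Path(source).read_bytes())
    return stale
```

A hook built on this would fail (or rewrite files) only for providers on the opt-in list, so the other providers stay untouched when the common code changes.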
>
> As long as no Windows users are trying to code for Airflow (do we need
> to consider them?), would it also work to use symlinks? Git can cope
> with this, but I don't know whether the Python toolchain dereferences
> the link and packages a copy rather than a symlink. It would be worth a
> test... It would save the pre-commit, and we could even selectively
> include common bla into providers as needed :-D
>
> Mit freundlichen Grüßen / Best regards
>
> Jens Scheffler
>
> -----Original Message-----
> From: Jarek Potiuk <ja...@potiuk.com>
> Sent: Wednesday, 21 February 2024 21:18
> To: dev@airflow.apache.org
> Subject: Re: [DISCUSS] Common.util provider?
>
> > if we have a common piece then we are locking all depending
> > providers (potentially) together if common code changes
>
> Yes, coupling in this case is the drawback of this solution. You can't
> have your cake and eat it too, and in this case you trade DRY for
> coupling.
>
> > As for the additional dependency complexity between providers, I
> > think the additional dependency actually creates more problems than
> > benefits… It would be cool if there were an option to "inline" common
> > code from a single place but keep individual providers fully
> > independent…
>
> Well, we already do a lot of inlining, so if we think we should do
> more, we have mechanisms for that. We have pre-commits and release
> commands that do a lot of that. Pre-commits inline scripts in
> Dockerfiles and shorten the PyPI readme. The providers' __init__.py
> files, changelogs and (partially) the index documentation .rst are
> generated at release documentation preparation time, pyproject.toml
> files for providers are generated from common templates at package
> building time, and so on and so on :). So we can do more of that and
> generate common code; it's just a matter of adding pre-commits or
> breeze scripts. But again, "you can't have your cake and eat it too":
> this has the drawback that there are extra steps involved, and even if
> it's automated it adds friction when you have to regenerate the code
> every time you change it, and when you change it in a different place
> than where you use it.
>
> J.
>
> On Wed, Feb 21, 2024 at 9:02 PM Scheffler Jens (XC-AS/EAE-ADA-T) < 
> jens.scheff...@de.bosch.com.invalid> wrote:
>
> > Hi Jarek,
> >
> > While reviewing the PR from uranusjr for AIP-60 I also had the feeling
> > that a lot of very similar code is repeated in all the providers.
> > But during review yesterday I dropped the idea, because if we have a
> > common piece then we are locking all depending providers
> > (potentially) together if common code changes.
> > As for the additional dependency complexity between providers, I
> > think the additional dependency actually creates more problems than
> > benefits… It would be cool if there were an option to "inline"
> > common code from a single place but keep individual providers fully
> > independent…
> >
> > Jens
> >
> > ________________________________
> > From: Jarek Potiuk <ja...@potiuk.com>
> > Sent: Wednesday, February 21, 2024 5:42:20 PM
> > To: dev@airflow.apache.org <dev@airflow.apache.org>
> > Subject: [DISCUSS] Common.util provider?
> >
> > Hello everyone,
> >
> > How do we feel about introducing a common.util provider?
> >
> > I know it's not been the original idea behind providers, but - after 
> > testing common.sql and now also having common.io, seems like more 
> > and more we would like to extract some common code that we would 
> > like providers to use, but we refrain from it, because it will only 
> > be actually usable 6 months after we introduce some common code.
> >
> > However, if we introduce common.util, this problem is generally gone
> > - at the expense of more complex maintenance and cross-provider
> > dependencies.
> > We should be able to add a common util method and use it in a
> > provider at the same time.
> >
> > Think of the Amazon provider using a new feature released in
> > common.util >=1.2.0 and the Google provider >=1.1.0. All manageable,
> > and we do it already for common.sql. We know how to do it, we know
> > what to avoid, we know we cannot introduce backwards-incompatible
> > changes, so we have to be very clear what is and what is not a public
> > API there. We could rather easily add tests to prevent such
> > backwards-incompatible changes. We even have a solution for
> > chicken-egg providers where we need to release two providers at the
> > same time if they depend on each other. Generally speaking it's quite
> > workable but adds a bit of overhead.
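
One way such a backwards-compatibility test could look, as a rough sketch: record the signatures of the public functions of a module and compare them against a stored snapshot. The helper names are hypothetical, and the check is deliberately strict (any signature change is reported, even a compatible one such as a new optional argument):

```python
import inspect


def public_api(module):
    """Map each public function name in a module to its signature string."""
    return {
        name: str(inspect.signature(obj))
        for name, obj in vars(module).items()
        if not name.startswith("_") and inspect.isfunction(obj)
    }


def breaking_changes(recorded, current):
    """List recorded names that were removed or whose signature changed."""
    return sorted(
        name for name, signature in recorded.items()
        if current.get(name) != signature
    )
```

A test could keep the recorded mapping in the repository and fail whenever breaking_changes returns a non-empty list for the current code.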
> >
> > Examples that we could implement as "common.util":
> >
> > - common versioning check with cache - where multiple providers 
> > could reuse "do we have pendulum 2"
> > - more complex - some date management features (we have a few like 
> > date_ranges/round_time). But there are many more.
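
The first example in the list above, a version check with a cache, might be sketched like this. It assumes the check is based on installed distribution metadata; the function names are hypothetical and not an existing Airflow API:

```python
import re
from functools import cache
from importlib.metadata import PackageNotFoundError, version


@cache
def installed_major_version(distribution):
    """Return the installed major version of a distribution, or None."""
    try:
        raw = version(distribution)
    except PackageNotFoundError:
        return None
    # Skip an optional epoch prefix such as "1!" and read the first number.
    match = re.match(r"(?:\d+!)?(\d+)", raw)
    return int(match.group(1)) if match else None


def is_pendulum_2():
    """Hypothetical helper answering "do we have pendulum 2"."""
    return installed_major_version("pendulum") == 2
```

Because of the cache, the metadata lookup runs once per distribution name per process, no matter how many providers ask the same question.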
> >
> > I generally do not love the common "util" approach. It has a
> > tendency to become a bag of everything over time, but if we limit it
> > to a set of small, fully decoupled modules where each module is
> > independent - it's OK. And we already have it in "airflow.utils" and
> > we seem to be doing well.
> >
> > WDYT? Is it worth it?
> >
> > J.
> >
>
