@Uranusjr, could this symlink approach serve as a pilot in your AIP-60 code that parses and validates dataset URIs?
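
One way to re-check the symlink behaviour reported in the quoted reply below: a wheel is a zip archive, so a small script can list any members that are still stored as symlinks instead of dereferenced copies. This is only a sketch; `symlink_members` is a hypothetical helper name, not part of Airflow or flit, and it assumes the archive was written with Unix mode bits.

```python
import stat
import zipfile


def symlink_members(wheel):
    """Return names of archive members stored as symlinks.

    `wheel` is a path or a binary file object (hypothetical helper, for illustration).
    """
    with zipfile.ZipFile(wheel) as archive:
        return [
            info.filename
            for info in archive.infolist()
            # The upper 16 bits of external_attr hold the Unix st_mode, when present.
            if stat.S_ISLNK(info.external_attr >> 16)
        ]
```

An empty result for a built provider wheel would be consistent with the file having been copied rather than linked.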

Best regards

Jens Scheffler

Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY | www.bosch.com
Tel. +49 711 811-91508 | Mobile +49 160 90417410 | jens.scheff...@de.bosch.com

Registered office: Stuttgart; Registration court: Amtsgericht Stuttgart, HRB 14000;
Chairman of the Supervisory Board: Prof. Dr. Stefan Asenkerschbaumer;
Managing Directors: Dr. Stefan Hartung, Dr. Christian Fischer, Dr. Markus Forschner, Stefan Grosch, Dr. Markus Heyn, Dr. Frank Meyer, Dr. Tanja Rückert

-----Original Message-----
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Thursday, 22 February 2024 00:53
To: dev@airflow.apache.org
Subject: Re: [DISCUSS] Common.util provider?

Yep. It could work with symbolic links. I tested it with flit: in both the wheel and the sdist, such a symbolically linked file is dereferenced and a copy of the file is added there. It could be a nice way of doing it. Maybe worth trying next time someone has a need?

J

On Thu, Feb 22, 2024 at 12:39 AM Scheffler Jens (XC-AS/EAE-ADA-T) <jens.scheff...@de.bosch.com.invalid> wrote:

> >>> As for the additional dependency complexity between providers, I think the additional dependency actually creates more problems than benefits… it would be cool if there were an option to "inline" common code from a single place but keep individual providers fully independent…
>
> >Well, we already do a lot of inlining, so if we think we should do more, we have mechanisms for that. We have pre-commits and release commands that do a lot of that. Pre-commits inline scripts in Dockerfiles and shorten the PyPI readme. The providers' __init__.py files, changelogs and index documentation .rst (partially) are generated at release documentation preparation time, pyproject.toml files for providers are generated from common templates at package building time, and so on and so on :).
> >So we can do more of that and generate common code; it's just a matter of adding pre-commits or breeze scripts. But again, "can't have and eat cake": this has the drawback that there are extra steps involved, and even if it's automated it does add friction when you have to regenerate the code every time you change it and when you change it in another place than where you use it.
>
> Yes, I also thought for a moment about pre-commit. I'd be okay if we really inline and have a pre-commit that keeps the redundant Python code in the folders aligned. It might need to be opt-in if only 10 of 85 providers are using common stuff; if we change a common line we probably do not need to affect all providers.
>
> As long as no Windows users are trying to code for Airflow (do we need to consider that?), would it also work to use symlinks? Git can cope with this; I don't know whether the Python toolchain de-references the link into a copy rather than packaging a symlink. Would be worth a test... it would save the pre-commit and we could even selectively include common bla into providers as needed :-D
>
> Best regards
>
> Jens Scheffler
>
> -----Original Message-----
> From: Jarek Potiuk <ja...@potiuk.com>
> Sent: Wednesday, 21 February 2024 21:18
> To: dev@airflow.apache.org
> Subject: Re: [DISCUSS] Common.util provider?
>
> > if we have a common piece then we are locking all depending providers (potentially) together if common code changes
>
> Yes, coupling is the drawback of this solution. You can't have your cake and eat it too; in this case you trade DRY for coupling.
>
> > As for the additional dependency complexity between providers, I think the additional dependency actually creates more problems than benefits… it would be cool if there were an option to "inline" common code from a single place but keep individual providers fully independent…
>
> Well, we already do a lot of inlining, so if we think we should do more, we have mechanisms for that. We have pre-commits and release commands that do a lot of that. Pre-commits inline scripts in Dockerfiles and shorten the PyPI readme. The providers' __init__.py files, changelogs and index documentation .rst (partially) are generated at release documentation preparation time, pyproject.toml files for providers are generated from common templates at package building time, and so on and so on :). So we can do more of that and generate common code; it's just a matter of adding pre-commits or breeze scripts. But again, "can't have and eat cake": this has the drawback that there are extra steps involved, and even if it's automated it does add friction when you have to regenerate the code every time you change it and when you change it in another place than where you use it.
>
> J.
>
> On Wed, Feb 21, 2024 at 9:02 PM Scheffler Jens (XC-AS/EAE-ADA-T) <jens.scheff...@de.bosch.com.invalid> wrote:
>
> > Hi Jarek,
> >
> > While reviewing the PR from uranusjr for AIP-60 I also had the feeling that a lot of very similar code is repeated in all the providers. But during review yesterday I dropped the idea, because if we have a common piece then we are locking all depending providers (potentially) together if common code changes.
> > As for the additional dependency complexity between providers, I think the additional dependency actually creates more problems than benefits… it would be cool if there were an option to "inline" common code from a single place but keep individual providers fully independent…
> >
> > Jens
> >
> > Sent from Outlook for iOS
> > ________________________________
> > From: Jarek Potiuk <ja...@potiuk.com>
> > Sent: Wednesday, February 21, 2024 5:42:20 PM
> > To: dev@airflow.apache.org <dev@airflow.apache.org>
> > Subject: [DISCUSS] Common.util provider?
> >
> > Hello everyone,
> >
> > How do we feel about introducing a common.util provider?
> >
> > I know it's not been the original idea behind providers, but after testing common.sql and now also having common.io, it seems that more and more we would like to extract some common code for providers to use, yet we refrain from it, because it will only be actually usable 6 months after we introduce it.
> >
> > However, if we introduce common.util, this problem is generally gone, at the expense of more complex maintenance and cross-provider dependencies. We should be able to add a common util method and use it in a provider at the same time.
> >
> > Think of the Amazon provider using a new feature released in common.util >=1.2.0 and the Google provider >=1.1.0. All manageable, and we do it already for common.sql.
> > We know how to do it, we know what to avoid, and we know we cannot introduce backwards-incompatible changes, so we have to be very clear what is and what is not a public API there. We could rather easily add tests to prevent such backwards-incompatible changes. We even have a solution for chicken-egg providers where we need to release two providers at the same time if they depend on each other. Generally speaking it's quite workable but adds a bit of overhead.
> >
> > Examples that we could implement as "common.util":
> >
> > - common versioning check with cache - where multiple providers could reuse "do we have pendulum 2"
> > - more complex - some date management features (we have a few like date_ranges/round_time). But there are many more.
> >
> > I generally do not love the common "util" approach. It has a tendency to become a bag of everything over time. But if we limit it to a set of small, fully decoupled modules where each module is independent, it's OK. And we already have it in "airflow.utils" and we seem to be doing well.
> >
> > WDYT? Is it worth it?
> >
> > J.
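
The "common versioning check with cache" example from the original proposal could look roughly like the sketch below. It is illustrative only: `installed_major_version` and `has_pendulum_2` are hypothetical names, not existing Airflow or common.util APIs, and the sketch relies solely on the standard library's `importlib.metadata`.

```python
from __future__ import annotations

import re
from functools import lru_cache
from importlib.metadata import PackageNotFoundError, version


@lru_cache(maxsize=None)
def installed_major_version(distribution: str) -> int | None:
    """Return the installed major version of a distribution, or None if unknown.

    The result is cached, so repeated checks from several providers
    only query the package metadata once per distribution name.
    """
    try:
        raw = version(distribution)
    except PackageNotFoundError:
        return None
    # Take the leading run of digits as the major version.
    match = re.match(r"(\d+)", raw)
    return int(match.group(1)) if match else None


def has_pendulum_2() -> bool:
    """Hypothetical shared answer to "do we have pendulum 2"."""
    return installed_major_version("pendulum") == 2
```

A provider would then call `has_pendulum_2()` instead of repeating its own import-and-compare logic.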