And another, far more important reason (and the reason why we have common.sql): we could VERY LIKELY (maybe Maciej and Kacper could comment on that) have the equivalent of column-level lineage implemented once for all the engines, by adding OpenLineage information to "common.dataframe".

On Tue, Jun 25, 2024 at 4:46 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> That is a very good question and I forgot to mention it. The main reason
> is the same as in common.io - we could make it work with our standard
> "Hook/Connection" framework so that you could get
> authentication information from Airflow Connections, plugging in the
> Secrets/ DB Connection information.
>
> So basically that would be a glue between Airflow configuration of
> authentication and Ibis.
>
> J.
>
>
> On Tue, Jun 25, 2024 at 12:45 AM Daniel Standish
> <daniel.stand...@astronomer.io.invalid> wrote:
>
>> There might be a good case for "why ibis", but why should airflow wrap
>> ibis? Why do we need a common dataframe library? Is ibis not "that"
>> already?
>>
>> On Mon, Jun 24, 2024 at 3:31 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>>
>> > Yeah, the other option is to include it in the common.sql package since
>> > they are related. But I am okay with the common.dataframe, too.
>> >
>> > On Mon, 24 Jun 2024 at 20:04, Jarek Potiuk <ja...@potiuk.com> wrote:
>> >
>> > > Hello here,
>> > >
>> > > At Pycon US earlier this year I had a number of interesting conversations
>> > > and one of the - very interesting - conversations I had was with the Ibis
>> > > team and I thought maybe we should consider releasing "common.dataframe"
>> > > provider for Airflow - following up after "common.sql" and "common.io".
>> > >
>> > > Ibis is gaining a lot of popularity recently and it might be at
>> > > more-or-less the same "place" as fsspec when Bolke added "common.io". Plus
>> > > if airflow adds it as a community provider, it might also bring Ibis'
>> > > popularity up.
>> > >
>> > > In short - Ibis is a "Portable Python dataframe library". It becomes more
>> > > and more popular and it not only serves 20+ dataframe backends with the
>> > > same, portable API, but also allows to mix SQL with dataframes and few more
>> > > things. Some time ago there were some ideas that we could add "SQLAlchemy"
>> > > as an additional "common" interface in "common.sql" - but actually it seems
>> > > that Ibis provides a much better abstraction that unifies SQL and Dataframe
>> > > approach nicely - way better suited for the "data science" world of
>> > > Airflow.
>> > >
>> > > You can see very nice overview "why Ibis" here:
>> > > https://ibis-project.org/why
>> > > - and I think it would be pretty natural thing to add on top of
>> > > "common.sql" and "common.io" - following "Airflow As a Platform" mantra.
>> > >
>> > > WDYT?
>> > >
>> > > J.
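
To make the "glue between Airflow configuration of authentication and Ibis" idea from the thread a bit more concrete, here is a minimal sketch of what such glue might do. Everything in it is an assumption for illustration: no "common.dataframe" provider exists in this thread, `ConnectionInfo` is a hypothetical stand-in that only mirrors the usual Airflow Connection fields (`conn_type`, `host`, `login`, `password`, `port`, `schema`), and `to_backend_url` is a made-up helper name. The only Ibis call referenced is `ibis.connect(url)`, and it is left as a comment so the snippet runs without Ibis installed.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass
class ConnectionInfo:
    """Hypothetical stand-in for the fields an Airflow Connection exposes."""

    conn_type: str
    host: str
    login: Optional[str] = None
    password: Optional[str] = None
    port: Optional[int] = None
    schema: Optional[str] = None


def to_backend_url(conn: ConnectionInfo) -> str:
    """Build a URL of the form scheme://user:password@host:port/database."""
    auth = ""
    if conn.login:
        # Percent-encode credentials so characters like "@" or "/" stay valid.
        auth = quote(conn.login, safe="")
        if conn.password:
            auth += ":" + quote(conn.password, safe="")
        auth += "@"
    port = f":{conn.port}" if conn.port else ""
    database = f"/{conn.schema}" if conn.schema else ""
    return f"{conn.conn_type}://{auth}{conn.host}{port}{database}"


# Example values are invented; a real hook would look the connection up
# in Airflow (including any configured Secrets backend) instead.
conn = ConnectionInfo(
    conn_type="postgres",
    host="db.example.com",
    login="airflow",
    password="p@ss/word",
    port=5432,
    schema="analytics",
)
url = to_backend_url(conn)

# With Ibis installed, a hook could then hand the URL over, for example:
#   import ibis
#   backend = ibis.connect(url)
```

The point of the sketch is only that credentials stay in Airflow's connection store and the dataframe code never hard-codes them; how a real provider would map connection types to Ibis backends is left open here.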