+1 from me. I think the dashboard idea is great!

On Thu, Oct 19, 2023 at 7:05 PM Andrey Anshin <andrey.ans...@taragol.is> wrote:

> Because 4 out of 5 new providers have a draft PR, I would like to raise a
> question that relates to all new providers, just to avoid asking the same
> question in every PR.
>
> Do we actually want to make new operators that are kind of like
> "PythonOperator"? Maybe I am missing something important, but I can't see
> why that would work better than running hook methods inside
> PythonOperator / TaskFlow.
>
> For reference:
> Add Cohere Provider:
> https://github.com/apache/airflow/pull/34921#discussion_r1358525838
> Enable pgvector support for Postgres provider:
> https://github.com/apache/airflow/pull/34891#discussion_r1362910782
> Add OpenAI Provider:
> https://github.com/apache/airflow/pull/35023#discussion_r1365235167
> Add Weaviate Provider:
> https://github.com/apache/airflow/pull/35060/files#r1365765741
>
> ----
> Best Wishes
> *Andrey Anshin*
>
> On Tue, 17 Oct 2023 at 22:42, Kaxil Naik <kaxiln...@apache.org> wrote:
>
> > Hey Everyone,
> >
> > As a follow-up to my Keynote talk, Building and deploying LLM applications
> > with Apache Airflow <https://www.youtube.com/watch?v=mgA6m3ggKhs&t=4s>, I
> > am formally proposing the addition of these 5 providers to the Apache
> > Airflow repo:
> >
> > - PgVector <https://github.com/pgvector/pgvector>
> > - Weaviate <https://weaviate.io/>
> > - Pinecone <https://www.pinecone.io/>
> > - OpenAI <https://openai.com/>
> > - Cohere <https://cohere.com/>
> >
> > Advancements in LLMs are moving at a rapid pace and transforming the way
> > we work and our industry. Although LLMs are simple to use in prototyping,
> > using LLMs for enterprise applications and in production still presents a
> > lot of challenges. These
> > <https://speakerdeck.com/kaxil/building-and-deploying-llm-applications-with-apache-airflow?slide=8>
> > are some of the same problems that we tackle in Data Engineering, and
> > Airflow is a natural fit for them.
> >
> > We at Astronomer would like to add first-class support for the popular
> > LLMs (OpenAI & Cohere) and vector DBs (PgVector, Weaviate & Pinecone) so
> > that Data Scientists and ML engineers can utilize them natively with
> > easy-to-use Operator & Hook abstractions while providing a native (and
> > production-ready) approach for authentication, retries, logging, etc.
> >
> > We also think this is vital for the Apache Airflow project as we, the
> > project, embrace the LLM tide and continue to be a great example of
> > balancing innovation and maintaining backward compatibility.
> >
> > The first versions of these providers will enable building one of the
> > most common use cases of LLMs, i.e. Question Answering / Chatbots using
> > Retrieval-augmented generation (RAG) done with the help of embeddings.
> >
> > Everyone is welcome and encouraged to contribute once the PRs are merged.
> > Astronomer is committed to maintaining these providers in the Airflow
> > repo, including reviewing PRs, maintaining code quality, testing and
> > keeping the APIs up-to-date.
> >
> > Note: PgVector <https://github.com/pgvector/pgvector> is an open-source
> > project, so we don’t need a formal vote for it as per our guidelines
> > <https://github.com/apache/airflow/blob/main/PROVIDERS.rst#accepting-new-community-providers>.
> > So please consider this email as seeking a Lazy Consensus for it.
> >
> > I will open up a VOTING thread after discussing this for a few days.
> >
> > Thanks.
> >
> > Regards,
> >
> > Kaxil