We do see a not-insignificant number of our customers using Ray and Airflow together on Astro and Astronomer-Software, so there is definitely interest, and I believe this could be really valuable. https://github.com/astronomer/astro-provider-ray/ was created for validation purposes, but we were not able to invest significantly in it due to priority changes. I'd love to see an official provider within the project. Checking pypistats quickly, Ray had more than 13M downloads last month.
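For anyone following along, the job-submission flow described in the quoted proposal could look roughly like the sketch below. It is only an illustration, not code from either existing provider: `make_client` and `submit_and_wait` are hypothetical helper names, the dashboard address is an assumed local default, and the only Ray calls used are `JobSubmissionClient.submit_job` and `get_job_status` from Ray's job submission SDK.

```python
import time

# Job states after which Ray will not change the status again.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED"}


def make_client(address="http://127.0.0.1:8265"):
    """Build Ray's job client (hypothetical helper; address is an assumed default).

    Imported lazily so the rest of this sketch can be read and exercised
    without Ray installed.
    """
    from ray.job_submission import JobSubmissionClient

    return JobSubmissionClient(address)


def submit_and_wait(client, entrypoint, runtime_env=None,
                    poll_interval=5.0, timeout=3600.0):
    """Submit a job to an existing Ray cluster and poll until it finishes.

    `client` is anything exposing `submit_job` and `get_job_status`,
    for example the object returned by `make_client`.
    Returns a `(job_id, final_state)` tuple.
    """
    job_id = client.submit_job(entrypoint=entrypoint, runtime_env=runtime_env)
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_job_status(job_id)
        # Ray returns an enum; fall back to the raw value for plain strings.
        state = str(getattr(status, "value", status))
        if state in TERMINAL_STATES:
            return job_id, state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Ray job {job_id} still {state} after {timeout}s")
        time.sleep(poll_interval)
```

An operator's `execute()` could wrap a call such as `submit_and_wait(make_client(address), "python my_script.py")` and fail the task when the final state is not `SUCCEEDED`. The cluster itself would still come from whichever provider created it (GKE, Vertex AI, or another cloud).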
On Tue, May 27, 2025 at 3:30 PM Jens Scheffler <j_scheff...@gmx.de.invalid> wrote:

> Hi,
>
> Thanks for the proposal. I assume you have read the
> https://github.com/apache/airflow/blob/main/PROVIDERS.rst#accepting-new-community-providers
> docs?
>
> By accident I also had a (not maturing) discussion about integrating
> Ray as a cluster backend into Airflow workflows. But I am not sure how
> common the demand is. Are there any voices?
>
> Have you considered just providing the operators separately and linking
> them in the ecosystem?
>
> Have you seen that a provider was created on GitHub a few years ago, but
> it seems it has not been maintained for a while:
> https://github.com/anyscale/airflow-provider-ray
> As well as https://github.com/astronomer/astro-provider-ray/
>
> Jens
>
> On 27.05.25 16:11, Maksim Yermakou wrote:
> > Hello all,
> >
> > I would like to propose adding a new provider for the Ray[1] service to
> > the Airflow providers.
> >
> > Ray is an open source framework to build and scale ML and Python
> > applications. Currently, the Google provider has two services that can
> > work with Ray: GKE and Vertex AI. But it is important to know that the
> > operators for working with Ray on GKE and on Vertex AI only create a
> > Ray cluster on Google Cloud infrastructure. To start a Ray application
> > we need to start a Ray Job on the Ray Cluster. To do that, users need
> > to use Ray's Python SDK.
> >
> > Knowing all of this, I suggest creating a new provider for Ray itself
> > with operators that can manage Ray Jobs. Here[2] is the code for the
> > client for working with Jobs. We need a new provider because Ray is not
> > a Google service. And if we want to give users the ability to submit
> > jobs to clusters, then we need to create new operators and put them in
> > the new provider. Also, Ray can work with clusters deployed on AWS,
> > Azure and more. This means that operators from this provider can be
> > used in combination with operators from the amazon and microsoft
> > providers.
> >
> > I have started the implementation, and I will be glad to hear any
> > feedback from all of you about my proposal.
> >
> > [1] https://docs.ray.io/en/latest/index.html
> > [2] https://github.com/ray-project/ray/blob/master/python/ray/dashboard/modules/job/sdk.py#L35