Hello all,

I would like to propose adding a new provider for the Ray[1] service to the
Airflow providers.

Ray is an open source framework for building and scaling ML and Python
applications. The Google provider currently covers two services that can
work with Ray: GKE and Vertex AI. However, the operators for Ray on GKE
and on Vertex AI only create a Ray cluster on Google Cloud infrastructure.
To start a Ray application, we need to submit a Ray Job to the Ray
cluster, and for that users have to use Ray’s Python SDK.
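
To illustrate the gap, here is a minimal sketch of what submitting a job
through the SDK looks like today. It assumes the client behaves like
ray.job_submission.JobSubmissionClient (submit_job and get_job_status);
the helper name submit_and_wait, the polling interval and the timeout are
my own illustrative choices, not part of Ray or Airflow.

```python
import time

# Terminal values of ray.job_submission.JobStatus (a str-valued enum).
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED"}


def submit_and_wait(client, entrypoint, poll_interval=5.0, timeout=600.0):
    """Submit `entrypoint` as a Ray Job and block until it finishes.

    `client` is expected to behave like
    ray.job_submission.JobSubmissionClient. This helper is hypothetical.
    Returns a (job_id, final_status) tuple.
    """
    job_id = client.submit_job(entrypoint=entrypoint)
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_job_status(job_id)
        # JobStatus is an enum; plain strings are passed through unchanged.
        status = str(getattr(status, "value", status))
        if status in TERMINAL_STATES:
            return job_id, status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Ray job {job_id} is still {status} after {timeout}s"
            )
        time.sleep(poll_interval)
```

Against a real cluster this would be called with a client such as
JobSubmissionClient("http://127.0.0.1:8265"), where the address is an
assumed local Ray dashboard URL, and an entrypoint like
"python my_script.py".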

Given this, I suggest creating a new provider for Ray itself, with
operators that manage Ray Jobs. Here[2] is the code of the client for
working with Jobs. We need a new provider because Ray is not a Google
service: if we want to give users the ability to submit jobs to clusters,
we need to create new operators, and they belong in a new provider. Ray
can also work with clusters deployed on AWS, Azure and more, which means
operators from this provider can be used in combination with operators
from the amazon and microsoft providers.
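
As a rough sketch of the kind of operator I have in mind (not the actual
implementation): the class name RaySubmitJobOperator, its parameters and
the get_client helper are all hypothetical. It assumes Ray's
JobSubmissionClient for the submission itself, and the fallback
BaseOperator exists only so the snippet can run without Airflow installed.

```python
try:
    from airflow.models.baseoperator import BaseOperator
except ImportError:
    # Minimal stand-in so the sketch can run without Airflow installed.
    class BaseOperator:
        def __init__(self, *, task_id, **kwargs):
            self.task_id = task_id


class RaySubmitJobOperator(BaseOperator):
    """Hypothetical operator: submit a Ray Job to an existing cluster."""

    def __init__(self, *, address, entrypoint, runtime_env=None, **kwargs):
        super().__init__(**kwargs)
        self.address = address          # Ray dashboard URL of the cluster
        self.entrypoint = entrypoint    # shell command to run on the cluster
        self.runtime_env = runtime_env  # optional Ray runtime environment

    def get_client(self):
        # Imported lazily so that parsing a DAG file does not require Ray.
        from ray.job_submission import JobSubmissionClient

        return JobSubmissionClient(self.address)

    def execute(self, context):
        job_id = self.get_client().submit_job(
            entrypoint=self.entrypoint, runtime_env=self.runtime_env
        )
        # Airflow pushes the return value of execute() to XCom by default.
        return job_id
```

Because the operator only needs the cluster address, the cluster itself
could be created by a GKE, Vertex AI, amazon or microsoft operator
earlier in the same DAG.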

I have started the implementation, and I would be glad to hear any
feedback on this proposal.


[1] https://docs.ray.io/en/latest/index.html
[2] https://github.com/ray-project/ray/blob/master/python/ray/dashboard/modules/job/sdk.py#L35
