[DISCUSS] Create community "Apache YuniKorn" executor ?

Jarek Potiuk Tue, 15 Oct 2024 03:56:16 -0700

Hello here,

*TL;DR: I would love to start a discussion about creating (for Airflow
3.x - it does not have to be Airflow 3.0) a new community executor based
on YuniKorn*


You might remember my point about "replacing Celery Executor" when I raised
the Airflow 3 question. I never actually "meant" to replace (and remove)
Celery Executor, but I was more on a quest to see if we have a viable
alternative.

And I think we have one with Apache YuniKorn. https://yunikorn.apache.org/

While it is not a direct replacement (so I'd say it should be an additional
executor), I think YuniKorn can provide us with a number of features that
we currently cannot give to our users. From the discussions I had and the
talk I saw at Community Over Code in Denver, I believe it could also make
Airflow more capable, especially in the "optimization wars" context that I
wrote about in
https://lists.apache.org/thread/1mp6jcfvx67zd3jjt9w2hlj0c5ysbh8r

It seems like quite a good fit for the "Inference" use case that we want to
support for Airflow 3.

At Community Over Code I attended a talk by Apple engineers, named
"Maximizing GPU Utilization: Apache YuniKorn Preemption" (and had quite a
nice follow-up discussion with them), and I also had a very long discussion
with Cloudera people who have been using YuniKorn for years to optimize
their workloads.

The presentation was not recorded, but I will try to get the slides and
send them your way.

I think we should take a close look at it - because it seems to have saved
a ton of implementation effort for the Apple team running batch inference
in their multi-tenant internal environment - which I think is precisely
what you want to do.

YuniKorn (https://yunikorn.apache.org/) is an "app-aware" scheduler that
offers a number of queue / capacity management models and policies for
controlling various applications competing for GPUs from a common pool.

They mention things like:

* Gang scheduling (with gang scheduling preemption) for workloads that
require a minimum number of workers
* Support for latency-sensitive workloads
* Resource quota management - things like priorities of execution
* YuniKorn preemption - with guaranteed capacity and preemption when needed
- which improves the utilisation
* Preemption that minimizes preemption cost (pod-level preemption rather
than application-level preemption) - very customizable preemption with
opt-in/opt-out, queues, resource weights, fencing, FIFO/LIFO sorting etc.
* Runs in the cloud and on-premises

The talk described quite a few scenarios of preemption/utilization/
guaranteed resources etc. They also outlined the new features YuniKorn is
working on (intra-queue preemption etc.) and what could be done in the
future.
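
To make the "queues" part a bit more concrete: as far as I understand from
the YuniKorn docs, a pod is handed over to YuniKorn by setting
`schedulerName: yunikorn` on it and labelling it with `applicationId` and
`queue`. Below is a minimal Python sketch of what a pod manifest for an
Airflow task could look like under that assumption. The helper name
`yunikorn_pod_spec`, the queue `root.inference`, the application id and the
resource values are all made up for illustration. This is not an existing
Airflow or YuniKorn API, and the exact label names may differ between
YuniKorn versions.

```python
import json


def yunikorn_pod_spec(task_id: str, app_id: str, queue: str, gpus: int = 0) -> dict:
    """Build a plain-dict Kubernetes Pod manifest that asks for YuniKorn scheduling.

    Hypothetical helper for illustration only. `task_id` must already be a
    valid Kubernetes name fragment (lowercase letters, digits, dashes).
    """
    # Illustrative resource requests; real values would come from the task.
    resources = {"requests": {"cpu": "1", "memory": "1Gi"}}
    if gpus:
        # GPUs are requested through the extended resource name in `limits`.
        resources["limits"] = {"nvidia.com/gpu": str(gpus)}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"airflow-task-{task_id}",
            # Assumption: YuniKorn groups pods into an application and places
            # them in a queue based on these two labels.
            "labels": {"applicationId": app_id, "queue": queue},
        },
        "spec": {
            # Hand the pod to YuniKorn instead of the default scheduler.
            "schedulerName": "yunikorn",
            "restartPolicy": "Never",
            "containers": [
                {"name": "base", "image": "apache/airflow", "resources": resources}
            ],
        },
    }


if __name__ == "__main__":
    pod = yunikorn_pod_spec("train-model", "airflow-dag-run-1", "root.inference", gpus=1)
    print(json.dumps(pod, indent=2))
```

A potential executor could generate something along these lines per task
and leave the quota, priority and preemption decisions to the queue
configuration on the YuniKorn side.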


Coincidentally - Amogh Desai, together with a friend, submitted a talk for
Airflow Summit:

"A Step Towards Multi-Tenant Airflow Using Apache YuniKorn"

It did not make it to the Summit (another talk of Amogh's did) - but I
think back then we had not realized the potential of utilising YuniKorn to
optimize workflows managed by Airflow.

But we seem to have people in the community who know more about YuniKorn <>
Airflow relation (Amogh :) ) and could probably comment and add some "from
the trenches" experience to the discussion.

Here is the description of the talk that Amogh submitted:

Multi-tenant Airflow is hard and there have been novel approaches in the
recent past to converge this gap. A key obstacle in multi-tenant Airflow is
the management of cluster resources. This is crucial to avoid one malformed
workload from hijacking an entire cluster. It is also vital to restrict
users and groups from monopolizing resources in a shared cluster using
their workloads.

To tackle these challenges, we turn to Apache YuniKorn, a K8s scheduler
catering all kinds of workloads. We leverage YuniKorn’s hierarchical queues
in conjunction with resource quotas to establish multi-tenancy at both the
shared namespace level and within individual namespaces where Airflow is
deployed.

YuniKorn also introduces Airflow to a new dimension of preemption. Now,
Airflow workers can preempt resources from lower-priority jobs, ensuring
critical schedules in our data pipelines are met without compromise.

Join us for a discussion on integrating Airflow with YuniKorn, unraveling
solutions to these multi-tenancy challenges. We will also share our past
experiences while scaling Airflow and the steps we have taken to handle
real world production challenges in equitable multi-tenant K8s clusters.

I would love to hear what you think about it. I know we are deep into
Airflow 3.0 implementation - but this one can be discussed/implemented
independently and maybe it's a good idea to start sooner rather than later
if we see that it has good potential.

J.
