This all looks really good - sounds like something that we could do purely within the K8s Executor, and likely even make it compatible with Airflow 2 and release it independently.
On Tue, Oct 29, 2024 at 1:30 PM Amogh Desai <amoghdesai....@gmail.com> wrote:
>
> > As I understand what it means - if I read it correctly, it's mostly a
> > deployment issue - we don't even have to have a YuniKorn Executor - we can
> > use the K8s Executor and it will work out of the box, with scheduling
> > controlled by YuniKorn, but then we need to find a way to configure the
> > behaviour of tasks and DAGs (likely via pod annotations maybe?). That would
> > mean that it's mostly documentation on "How can I leverage YuniKorn with
> > Airflow" + maybe a Helm chart modification to install YuniKorn as an option?
> >
> > And then likely we need to add a little bit of metadata and some mapping of
> > "task" or "dag" or "task group" properties to open up more capabilities of
> > YuniKorn scheduling?
> >
> > Do I understand correctly?
>
> You mostly summed it up, but a few things.
>
> Yes, we can open up YuniKorn to schedule Airflow workloads by doing basically
> nothing, or at most very little manual work.
>
> But to really enable YuniKorn at full power, we will have to make some changes
> to the Airflow codebase. A few things off the top of my head: the admission
> controller will take care of the applicationId, scheduler name, etc., but from
> an initial read, if we want things like "schedule DAGs to a certain queue only"
> or something of that sort, we will need some labels to be injected - or, a
> level above, get the KPO to add some labels, like a queue. OR, even if we could
> specify the queue for every operator by extending the BaseOperator, that would
> be cool too.
>
> I personally think that if we could extend the KubernetesExecutor into a
> YunikornExecutor (naming doesn't matter to me), we could handle things like
> installing YuniKorn along with Airflow by making changes to the Helm chart,
> making it come up with the scheduler, admission controller, etc. We would also
> be able to make code changes in Airflow by controlling the internal logic via
> the executor type instead of leaving it all to the end user (I mean options
> like label injection, or labelling all the tasks of a group as an application,
> to adhere to Jarek's thought).
>
> Manikandan, feel free to add anything more from the YuniKorn side in case I
> have misinterpreted or just generally missed something :)
>
> Thanks & Regards,
> Amogh Desai
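A minimal sketch of the label/queue injection idea discussed above, using the plain
KubernetesExecutor: an executor_config with a pod_override can attach the metadata that
YuniKorn's admission controller reads. The annotation keys ("yunikorn.apache.org/queue",
"yunikorn.apache.org/app-id"), the queue name, and the DAG/task names here are assumptions
to be verified against the YuniKorn documentation; nothing below is from the thread itself.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from kubernetes.client import models as k8s

# Assumed annotation keys - verify against the YuniKorn admission controller docs.
YUNIKORN_QUEUE_CONFIG = {
    "pod_override": k8s.V1Pod(
        metadata=k8s.V1ObjectMeta(
            annotations={
                "yunikorn.apache.org/queue": "root.default",
                "yunikorn.apache.org/app-id": "yunikorn-queue-example",
            }
        )
    )
}

with DAG(
    dag_id="yunikorn_queue_example",
    start_date=datetime(2024, 10, 1),
    schedule=None,
):
    routed_task = PythonOperator(
        task_id="routed_task",
        python_callable=lambda: print("hello from a YuniKorn-scheduled pod"),
        # Only this task's worker pod carries the queue hint; everything else
        # keeps the executor defaults.
        executor_config=YUNIKORN_QUEUE_CONFIG,
    )

A YunikornExecutor (or a Helm-chart option) could inject this kind of config automatically
instead of leaving it to DAG authors, which is the direction Amogh describes above.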
>
> On Tue, Oct 29, 2024 at 1:28 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > This is cool.
> >
> > As I understand what it means - if I read it correctly, it's mostly a
> > deployment issue - we don't even have to have a YuniKorn Executor - we can
> > use the K8s Executor and it will work out of the box, with scheduling
> > controlled by YuniKorn, but then we need to find a way to configure the
> > behaviour of tasks and DAGs (likely via pod annotations maybe?). That would
> > mean that it's mostly documentation on "How can I leverage YuniKorn with
> > Airflow" + maybe a Helm chart modification to install YuniKorn as an option?
> >
> > And then likely we need to add a little bit of metadata and some mapping of
> > "task" or "dag" or "task group" properties to open up more capabilities of
> > YuniKorn scheduling?
> >
> > Do I understand correctly?
> >
> > > 1. Yunikorn treats applications at the DAG level, not at the task level,
> > > which is great. Due to this, we can try to leverage the gang scheduling
> > > abilities of Yunikorn.
> >
> > This is great. I was wondering if we could also allow the application on
> > the "Task Group" level. I find it a really interesting feature to be able
> > to treat a "Task Group" as an entity that we could treat as an
> > "application" - this way you could treat the "Task Group" as a "schedulable
> > entity" and, for example, set preemption properties for all tasks in the
> > same task group. Or gang scheduling for the task group ("only schedule
> > tasks in the task group when there are enough resources for the whole task
> > group"). Or - and this is something that I think of as a "holy grail" of
> > scheduling in the context of optimisation of machine learning workflows:
> > "make sure that all the tasks in a group are scheduled on the same node and
> > use the same local hardware resources" + if any of them fail, retry the
> > whole group - also on the same instance. I think this is partially possible
> > with some node affinity setup - but I would love it if we were able to set
> > a property on a task group effectively meaning "execute all tasks in the
> > group on the same hardware" - so a bit higher abstraction - and have
> > YuniKorn handle all the preemption and optimisation of scheduling for that.
> >
> > > 2. With the admission controller running, even the older DAGs will be
> > > able to benefit from the Yunikorn scheduling abilities without the need
> > > to make changes to the DAGs. This means that the same DAG will run with
> > > the default scheduler (K8s default) as well as Yunikorn if need be!
> >
> > Fantastic!
> >
> > > 3. As Mani mentioned, preemption capabilities can be explored due to this
> > > as well.
> > >
> > > I am happy to work on this effort and looking forward to it.
> >
> > Yeah that would be cool - also see above. I think if we are able to have
> > some "light touch" integration with YuniKorn, where we could handle a
> > "Task Group" as a schedulable entity + have some higher-level abstractions
> > / properties of it that would map into some "scheduling behaviour"
> > (preemption / gang scheduling) and document it, that would be a great and
> > easy way of expanding Airflow capabilities - especially for ML workflows.
> >
> > J.
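To illustrate the "task group as an application" idea above: a minimal sketch, assuming
that a shared "yunikorn.apache.org/app-id" annotation is what groups pods into one YuniKorn
application (the key is an assumption to be checked against the YuniKorn docs). Every task
in a TaskGroup gets the same id via executor_config. This alone would not give gang
scheduling or same-node placement - it only shows where such task-group-level properties
could be attached.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.utils.task_group import TaskGroup
from kubernetes.client import models as k8s


def group_as_yunikorn_app(app_id: str) -> dict:
    """Executor config tagging a worker pod with a shared (assumed) YuniKorn app id."""
    return {
        "pod_override": k8s.V1Pod(
            metadata=k8s.V1ObjectMeta(
                # Assumed annotation key - to be verified against the YuniKorn docs.
                annotations={"yunikorn.apache.org/app-id": app_id}
            )
        )
    }


with DAG(
    dag_id="yunikorn_task_group_example",
    start_date=datetime(2024, 10, 1),
    schedule=None,
):
    with TaskGroup(group_id="training") as training_group:
        for i in range(3):
            PythonOperator(
                task_id=f"worker_{i}",
                python_callable=lambda: None,
                # Every task in the group shares the same application id, so the
                # scheduler would see the group's pods as one application.
                executor_config=group_as_yunikorn_app("training-group-app"),
            )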
> >
> > On Tue, Oct 29, 2024 at 8:10 AM Amogh Desai <amoghdesai....@gmail.com> wrote:
> > >
> > > Building upon the POC done by Manikandan, I tried my hand at an
> > > experiment too.
> > >
> > > I wanted to mainly experiment with the Yunikorn admission controller,
> > > with the aim of making no changes to my older DAGs.
> > >
> > > I deployed a setup that looks like this:
> > >
> > > - Deployed Yunikorn in a kind cluster with the default configuration. The
> > > default configuration launches the Yunikorn scheduler as well as an
> > > admission controller, which watches for a `yunikorn-configs` configmap
> > > that can define queues, partitions, placement rules, etc.
> > >
> > > - Deployed Airflow using the Helm chart in the same kind cluster while
> > > specifying the executor as KubernetesExecutor.
> > >
> > > I wanted to test whether Yunikorn can take over the scheduling of Airflow
> > > workers. I created some queues using the config present here:
> > > https://github.com/apache/yunikorn-k8shim/blob/master/deployments/examples/namespace/queues.yaml
> > >
> > > I tried running the Airflow K8s executor example DAG
> > > https://github.com/apache/airflow/blob/main/airflow/example_dags/example_kubernetes_executor.py
> > > without any changes to the DAG, and was able to run it successfully.
> > >
> > > Results
> > >
> > > 1. The task pods get scheduled by Yunikorn instead of the default K8s
> > > scheduler.
> > >
> > > 2. I was able to observe a single application run for the Airflow DAG in
> > > the Yunikorn UI.
> > >
> > > Observations
> > >
> > > 1. Yunikorn treats applications at the DAG level, not at the task level,
> > > which is great. Due to this, we can try to leverage the gang scheduling
> > > abilities of Yunikorn.
> > >
> > > 2. With the admission controller running, even the older DAGs will be
> > > able to benefit from the Yunikorn scheduling abilities without the need
> > > to make changes to the DAGs. This means that the same DAG will run with
> > > the default scheduler (K8s default) as well as Yunikorn if need be!
> > >
> > > 3. As Mani mentioned, preemption capabilities can be explored due to this
> > > as well.
> > >
> > > I am happy to work on this effort and looking forward to it.
> > >
> > > Thanks & Regards,
> > > Amogh Desai
> > >
> > > On Tue, Oct 15, 2024 at 4:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >
> > > > Hello here,
> > > >
> > > > *Tl;DR; I would love to start a discussion about creating (for Airflow
> > > > 3.x - it does not have to be Airflow 3.0) a new community executor
> > > > based on YuniKorn*
> > > >
> > > > You might remember my point about "replacing the Celery Executor" when
> > > > I raised the Airflow 3 question. I never actually meant to replace (and
> > > > remove) the Celery Executor; I was more on a quest to see if we have a
> > > > viable alternative.
> > > >
> > > > And I think we have one with Apache YuniKorn:
> > > > https://yunikorn.apache.org/
> > > >
> > > > While it is not a direct replacement (so I'd say it should be an
> > > > additional executor), I think YuniKorn can provide us with a number of
> > > > features that we currently cannot give to our users, and from the
> > > > discussions I had and the talk I saw at Community Over Code in Denver,
> > > > I believe it might make Airflow more capable, especially in the
> > > > "optimization wars" context that I wrote about in
> > > > https://lists.apache.org/thread/1mp6jcfvx67zd3jjt9w2hlj0c5ysbh8r
> > > >
> > > > It seems like quite a good fit for the "Inference" use case that we
> > > > want to support for Airflow 3.
> > > >
> > > > At Community Over Code I attended a talk (and had quite a nice
> > > > follow-up discussion) from Apple engineers, named "Maximizing GPU
> > > > Utilization: Apache YuniKorn Preemption", and had a very long
> > > > discussion with Cloudera people who have been using YuniKorn for years
> > > > to optimize their workloads.
> > > >
> > > > The presentation is not recorded, but I will try to get the slides and
> > > > send them your way.
> > > >
> > > > I think we should take a close look at it - because it seems to save a
> > > > ton of implementation effort for the Apple team running batch inference
> > > > for their multi-tenant internal environment - which I think is
> > > > precisely what you want to do.
> > > >
> > > > YuniKorn (https://yunikorn.apache.org/) is an "app-aware" scheduler
> > > > that has a number of queue / capacity management models and policies
> > > > that allow controlling various applications competing for GPUs from a
> > > > common pool.
> > > >
> > > > They mention things like:
> > > >
> > > > * Gang scheduling / gang scheduling preemption, for workloads requiring
> > > > a minimum number of workers
> > > > * Support for latency-sensitive workloads
> > > > * Resource quota management - things like priorities of execution
> > > > * YuniKorn preemption - with guaranteed capacity and preemption when
> > > > needed - which improves utilisation
> > > > * Preemption that minimizes preemption cost (pod-level preemption
> > > > rather than application-level preemption) - very customizable
> > > > preemption with opt-in/opt-out, queues, resource weights, fencing,
> > > > support for FIFO/LIFO sorting, etc.
> > > > * Runs in the cloud and on-premise
> > > >
> > > > The talk described quite a few scenarios of preemption / utilization /
> > > > guaranteed resources etc. They also outlined what YuniKorn is working
> > > > on as new features (intra-queue preemption etc.) and what future things
> > > > can be done.
> > > >
> > > > Coincidentally - Amogh Desai and a friend submitted a talk for the
> > > > Airflow Summit:
> > > >
> > > > "A Step Towards Multi-Tenant Airflow Using Apache YuniKorn"
> > > >
> > > > It did not make it to the Summit (another talk of Amogh's did) - but I
> > > > think back then we had not realized the potential of utilising YuniKorn
> > > > to optimize workflows managed by Airflow.
> > > >
> > > > But we seem to have people in the community who know more about the
> > > > YuniKorn <> Airflow relationship (Amogh :) ) and could probably comment
> > > > and add some "from the trenches" experience to the discussion.
> > > >
> > > > Here is the description of the talk that Amogh submitted:
> > > >
> > > > Multi-tenant Airflow is hard and there have been novel approaches in
> > > > the recent past to converge this gap. A key obstacle in multi-tenant
> > > > Airflow is the management of cluster resources. This is crucial to
> > > > prevent one malformed workload from hijacking an entire cluster. It is
> > > > also vital to restrict users and groups from monopolizing resources in
> > > > a shared cluster with their workloads.
> > > >
> > > > To tackle these challenges, we turn to Apache YuniKorn, a K8s scheduler
> > > > catering to all kinds of workloads. We leverage YuniKorn's hierarchical
> > > > queues in conjunction with resource quotas to establish multi-tenancy
> > > > both at the shared namespace level and within individual namespaces
> > > > where Airflow is deployed.
> > > >
> > > > YuniKorn also introduces Airflow to a new dimension of preemption. Now,
> > > > Airflow workers can preempt resources from lower-priority jobs,
> > > > ensuring critical schedules in our data pipelines are met without
> > > > compromise.
> > > >
> > > > Join us for a discussion on integrating Airflow with YuniKorn,
> > > > unraveling solutions to these multi-tenancy challenges. We will also
> > > > share our past experiences while scaling Airflow and the steps we have
> > > > taken to handle real-world production challenges in equitable
> > > > multi-tenant K8s clusters.
> > > >
> > > > I would love to hear what you think about it. I know we are deep into
> > > > the Airflow 3.0 implementation - but this one can be discussed and
> > > > implemented independently, and maybe it's a good idea to start doing it
> > > > sooner rather than later if we see that it has good potential.
> > > >
> > > > J.
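On the gang scheduling point in the quoted list above: YuniKorn drives gang scheduling
through pod annotations describing task groups with a minimum member count. A minimal
sketch, assuming the "yunikorn.apache.org/task-group-name" and
"yunikorn.apache.org/task-groups" annotation keys and the JSON shape below (to be verified
against the YuniKorn gang scheduling docs), of what an Airflow worker pod could carry:

import json

from kubernetes.client import models as k8s

# Assumed annotation keys and JSON shape - double-check against the YuniKorn
# gang scheduling documentation before relying on them.
GANG_ANNOTATIONS = {
    "yunikorn.apache.org/task-group-name": "workers",
    "yunikorn.apache.org/task-groups": json.dumps(
        [
            {
                "name": "workers",
                # Do not start any pod of the group until resources for 4 members
                # of this size can be reserved.
                "minMember": 4,
                "minResource": {"cpu": "1", "memory": "2Gi"},
            }
        ]
    ),
}

# This could be attached to Airflow worker pods the same way as in the earlier
# sketches, e.g. via executor_config={"pod_override": ...} or a pod template file.
GANG_EXECUTOR_CONFIG = {
    "pod_override": k8s.V1Pod(
        metadata=k8s.V1ObjectMeta(annotations=GANG_ANNOTATIONS)
    )
}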