Re: [DISCUSS] Extract core autoscaling algorithm as new SubModule in flink-kubernetes-operator

Rui Fan Sun, 19 Feb 2023 18:36:15 -0800

Hi Gyula, Samrat and Shammon,

My team is also looking forward to the autoscaler being compatible with YARN.


Currently, all of our Flink jobs run on YARN. The autoscaler is a great
feature for Flink users, since it can greatly simplify the process of
tuning parallelism.

If the autoscaler is going to support YARN, I propose dividing the work into
two stages:
1. Only collect and evaluate scaling-related performance metrics,
 without triggering any job upgrades.
2. Support automatic upgrades of YARN jobs.
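
To make the two stages a bit more concrete, here is a rough Java sketch of how
a single evaluation path could serve both of them. Everything in it is an
assumption for illustration only: the names (TwoStageAutoscalerSketch, Mode,
JobUpgrader, recommendParallelism, evaluate) do not exist in
flink-kubernetes-operator, and the sizing rule is a toy placeholder, not the
FLIP-271 algorithm.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types exist in flink-kubernetes-operator. */
final class TwoStageAutoscalerSketch {

    /** Stage 1 = RECOMMEND_ONLY (no upgrades), stage 2 = APPLY (trigger upgrades). */
    enum Mode { RECOMMEND_ONLY, APPLY }

    /** Cluster-specific hook, e.g. a YARN redeploy; an assumed interface. */
    interface JobUpgrader {
        void rescale(String jobId, int newParallelism);
    }

    /**
     * Toy sizing rule (an assumption, not the FLIP-271 algorithm): the smallest
     * parallelism that serves the input rate at the target utilization.
     */
    static int recommendParallelism(
            double inputRate, double ratePerSubtask, double targetUtilization, int maxParallelism) {
        int needed = (int) Math.ceil(inputRate / (ratePerSubtask * targetUtilization));
        return Math.max(1, Math.min(needed, maxParallelism));
    }

    /** Always returns the recommendation; only calls the upgrader in APPLY mode. */
    static int evaluate(
            String jobId, int currentParallelism, double inputRate, double ratePerSubtask,
            Mode mode, JobUpgrader upgrader) {
        int recommended = recommendParallelism(inputRate, ratePerSubtask, 0.8, 128);
        if (mode == Mode.APPLY && recommended != currentParallelism) {
            upgrader.rescale(jobId, recommended);
        }
        return recommended;
    }

    public static void main(String[] args) {
        List<String> upgrades = new ArrayList<>();
        JobUpgrader recordingUpgrader = (jobId, p) -> upgrades.add(jobId + " -> " + p);

        // Stage 1: report only, the job is left untouched.
        int stage1 = evaluate("job-1", 4, 10_000, 1_000, Mode.RECOMMEND_ONLY, recordingUpgrader);
        System.out.println("recommended=" + stage1 + ", upgrades=" + upgrades);

        // Stage 2: the same evaluation, but the upgrade is actually requested.
        evaluate("job-1", 4, 10_000, 1_000, Mode.APPLY, recordingUpgrader);
        System.out.println("upgrades=" + upgrades);
    }
}
```

The point of the sketch is that stage 1 needs no upgrade mechanism at all, so
it could be tried on YARN first, and stage 2 only adds a cluster-specific
JobUpgrader behind the same evaluation code.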

I would also like to join this effort, so that we can improve it together.

And I am very happy that Gyula can help with the review.

Best,
Rui Fan

On Mon, Feb 20, 2023 at 8:56 AM Shammon FY <zjur...@gmail.com> wrote:

> Hi Samrat
>
> My team is also looking at this piece. After you give your proposal, we
> also hope to join it with you if possible. I hope we can improve this
> together for use in our production too, thanks :)
>
> Best,
> Shammon
>
> On Fri, Feb 17, 2023 at 9:27 PM Samrat Deb <decordea...@gmail.com> wrote:
>
> > @Gyula
> > Thank you
> > We will work on this and try to come up with an approach.
> >
> >
> >
> >
> > On Fri, Feb 17, 2023 at 6:12 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> > > In case you guys feel strongly about this, I suggest you try to fork the
> > > autoscaler implementation and make a version that works with both the
> > > Kubernetes operator and YARN.
> > > If your solution is generic and works well, we can discuss the way forward.
> > >
> > > Unfortunately, neither I nor my team really has the resources to assist you
> > > with the YARN effort as we are mostly invested in Kubernetes, but of course
> > > we are happy to review your work.
> > >
> > > Gyula
> > >
> > >
> > > On Fri, Feb 17, 2023 at 1:09 PM Prabhu Joseph <prabhujose.ga...@gmail.com> wrote:
> > >
> > > > @Gyula
> > > >
> > > > >> It is easier to make the operator work with jobs running in different
> > > > >> types of clusters than to take the autoscaler module itself and plug
> > > > >> that in somewhere else.
> > > >
> > > > Our (part of Samrat's team) main problem is to leverage the AutoScaler
> > > > Recommendation Engine part of Flink-Kubernetes-Operator for our Flink jobs
> > > > running on YARN.
> > > > Currently, it is not feasible as the autoscaler module is tightly coupled
> > > > with the operator. We agree that the operator serves the two core
> > > > requirements, but the operator itself cannot be used for Flink jobs
> > > > running on YARN. Those core requirements are solved through other
> > > > mechanisms in the case of YARN. But the main problem for us is *how to use
> > > > the AutoScaler Recommendation Engine for Flink Jobs on YARN.*
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Feb 17, 2023 at 6:34 AM Shammon FY <zjur...@gmail.com> wrote:
> > > >
> > > > > Hi Gyula, Samrat
> > > > >
> > > > > Thanks for your input. I totally agree with you that this is a really
> > > > > big piece of work. As @Samrat mentioned above, making the autoscaler
> > > > > completely independent will not be a short road either. But I still see
> > > > > some valuable points in a `completely independent autoscaler`, and I
> > > > > think this may be the goal we need to achieve in the future.
> > > > >
> > > > > 1. A large k8s cluster may manage thousands of machines, and users may
> > > > > run tens of thousands of Flink jobs in one k8s cluster. If the autoscaler
> > > > > manages all these jobs, the autoscaler itself should scale horizontally.
> > > > >
> > > > > 2. As you mentioned, "execute the job stateful upgrades safely" is indeed
> > > > > complex work, but I think we should decouple it from the k8s operator:
> > > > >
> > > > > a) In addition to k8s, there may be other resource management systems.
> > > > >
> > > > > b) Flink may support more scaling operations via the REST API, such as
> > > > > FLIP-291 [1].
> > > > >
> > > > > c) In our production environment, there's a 'Job Submission Gateway'
> > > > > which stores job info and config and monitors the status of running
> > > > > jobs. After the autoscaler upgrades a job, it must update the config in
> > > > > the Gateway so that users can restart their jobs with the updated config
> > > > > and avoid resource conflicts. Under these circumstances, having the
> > > > > autoscaler send upgrade requests to the gateway may be a good choice.
> > > > >
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management
> > > > >
> > > > >
> > > > > Best,
> > > > > Shammon
> > > > >
> > > > >
> > > > > On Thu, Feb 16, 2023 at 11:03 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> > > > >
> > > > > > @Shammon , Samrat:
> > > > > >
> > > > > > I appreciate the enthusiasm and I wish this was only a matter of
> > > > > > intention, but making the autoscaler work without the operator may be
> > > > > > a pretty big task.
> > > > > > You must not forget 2 core requirements here.
> > > > > >
> > > > > > 1. The autoscaler logic itself has to run somewhere (in this case on
> > > > > > k8s within the operator).
> > > > > > 2. Something has to execute the job stateful upgrades safely based on
> > > > > > the scaling decisions (in this case the operator does that).
> > > > > >
> > > > > > Requirement 1 can be solved almost anywhere easily; however, you need
> > > > > > resiliency etc. for this to be a prod application. Requirement 2 is the
> > > > > > really tricky part. The operator was actually built to execute job
> > > > > > upgrades; if you look at the code you will appreciate the complexity
> > > > > > of the task.
> > > > > >
> > > > > > As I said in the earlier thread, it is easier to make the operator
> > > > > > work with jobs running in different types of clusters than to take the
> > > > > > autoscaler module itself and plug that in somewhere else.
> > > > > >
> > > > > > Gyula
> > > > > >
> > > > > >
> > > > > > On Thu, Feb 16, 2023 at 3:12 PM Samrat Deb <decordea...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Shammon,
> > > > > > >
> > > > > > > Thank you for your input, completely aligned with you.
> > > > > > >
> > > > > > > We are fine with either option, but IMO, to start with, it will be
> > > > > > > easier to have it in the flink-kubernetes-operator as a module
> > > > > > > instead of a separate repo, which requires additional effort.
> > > > > > >
> > > > > > > We would be working incrementally on making the autoscaling
> > > > > > > recommendation framework generic enough. Once it reaches a point
> > > > > > > where the community feels it needs to be moved to a separate repo,
> > > > > > > we can take a call.
> > > > > > >
> > > > > > > Bests,
> > > > > > >
> > > > > > > Samrat
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Feb 16, 2023 at 7:37 PM Samrat Deb <decordea...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Max,
> > > > > > > > If you are fine with this and aligned with the same thought, then,
> > > > > > > > since this is going to be very useful to us, we are ready to help
> > > > > > > > with and contribute the additional work required.
> > > > > > > >
> > > > > > > > Bests,
> > > > > > > > Samrat
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, 16 Feb 2023 at 5:28 PM, Shammon FY <zjur...@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> Hi Samrat
> > > > > > > >>
> > > > > > > > >> Do you mean to create an independent module for Flink scaling in
> > > > > > > > >> flink-k8s-operator? How about creating a project such as
> > > > > > > > >> `flink-auto-scaling` which is completely independent? Besides
> > > > > > > > >> resource managers such as k8s and YARN, we can do more things in
> > > > > > > > >> the project, for example, updating config in the user's
> > > > > > > > >> `job submission system` after scaling Flink jobs. WDYT?
> > > > > > > >>
> > > > > > > >> Best,
> > > > > > > >> Shammon
> > > > > > > >>
> > > > > > > >>
> > > > > > > > >> On Thu, Feb 16, 2023 at 7:38 PM Maximilian Michels <m...@apache.org> wrote:
> > > > > > > >>
> > > > > > > >> > Hi Samrat,
> > > > > > > >> >
> > > > > > > > >> > The autoscaling module is now pluggable but it is still tightly
> > > > > > > > >> > coupled with Kubernetes. It will take additional work for the
> > > > > > > > >> > logic to work independently of the cluster manager.
> > > > > > > >> >
> > > > > > > >> > -Max
> > > > > > > >> >
> > > > > > > > >> > On Thu, Feb 16, 2023 at 11:14 AM Samrat Deb <decordea...@gmail.com> wrote:
> > > > > > > >> > >
> > > > > > > > >> > > Oh! It got merged yesterday.
> > > > > > > > >> > > Apologies, I missed the recent commit @Gyula.
> > > > > > > >> > >
> > > > > > > >> > > Thanks for the update
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > > >> > > On Thu, Feb 16, 2023 at 3:17 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> > > > > > > >> > >
> > > > > > > > >> > > > Max recently moved the autoscaler logic into a separate
> > > > > > > > >> > > > submodule, did you see that?
> > > > > > > > >> > > >
> > > > > > > > >> > > > https://github.com/apache/flink-kubernetes-operator/commit/5bb8e9dc4dd29e10f3ba7c8ce7cefcdffbf92da4
> > > > > > > >> > > >
> > > > > > > >> > > > Gyula
> > > > > > > >> > > >
> > > > > > > > >> > > > On Thu, Feb 16, 2023 at 10:27 AM Samrat Deb <decordea...@gmail.com> wrote:
> > > > > > > >> > > >
> > > > > > > >> > > > > Hi ,
> > > > > > > >> > > > >
> > > > > > > >> > > > > *Context:*
> > > > > > > > >> > > > > Auto Scaling was introduced in Flink as part of FLIP-271[1].
> > > > > > > > >> > > > > It discusses one of the important aspects: providing a robust
> > > > > > > > >> > > > > default scaling algorithm.
> > > > > > > > >> > > > >       a. Ensure scaling yields effective usage of assigned
> > > > > > > > >> > > > >          task slots.
> > > > > > > > >> > > > >       b. Ramp up in case of any backlog to ensure it gets
> > > > > > > > >> > > > >          processed in a timely manner.
> > > > > > > > >> > > > >       c. Minimize the number of scaling decisions to prevent
> > > > > > > > >> > > > >          costly rescale operations.
> > > > > > > > >> > > > > The FLIP intends to add an auto scaling framework based on 6
> > > > > > > > >> > > > > major metrics and contains different types of thresholds to
> > > > > > > > >> > > > > trigger the scaling.
> > > > > > > >> > > > >
> > > > > > > > >> > > > > Thread[2] discusses a different problem: why the autoscaler is
> > > > > > > > >> > > > > part of the operator instead of the jobmanager at runtime.
> > > > > > > > >> > > > > The community decided to keep the autoscaling logic in the
> > > > > > > > >> > > > > flink-kubernetes-operator.
> > > > > > > >> > > > >
> > > > > > > >> > > > > *Proposal: *
> > > > > > > > >> > > > > In this discussion, I want to put forward the idea of
> > > > > > > > >> > > > > extracting the auto scaling logic into a new submodule in the
> > > > > > > > >> > > > > flink-kubernetes-operator repository[3], which will be
> > > > > > > > >> > > > > independent of any resource manager/operator.
> > > > > > > > >> > > > > Currently the autoscaling algorithm is very tightly coupled
> > > > > > > > >> > > > > with the Kubernetes API.
> > > > > > > > >> > > > > This makes the core autoscaling algorithm hard to extend to
> > > > > > > > >> > > > > other available resource managers like YARN, Mesos etc.
> > > > > > > > >> > > > > A separate autoscaling module inside the flink kubernetes
> > > > > > > > >> > > > > operator will help other resource managers leverage the
> > > > > > > > >> > > > > autoscaling logic.
> > > > > > > >> > > > >
> > > > > > > >> > > > > [1]
> > > > > > > >> > > > >
> > > > > > > >> > > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-271%3A+Autoscaling
> > > > > > > >> > > > > [2]
> > > > > > > >>
> > > https://lists.apache.org/thread/pvfb3fw99mj8r1x8zzyxgvk4dcppwssz
> > > > > > > >> > > > > [3]
> > https://github.com/apache/flink-kubernetes-operator
> > > > > > > >> > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > > Bests,
> > > > > > > >> > > > > Samrat
> > > > > > > >> > > > >
> > > > > > > >> > > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
