Hi Max,

As the FLIP mentions, we plan to add an alternative implementation.
First of all, we will develop a generic autoscaler. This generic autoscaler will not have knowledge of specific jobs, and users will have the flexibility to pass the JobAutoScalerContext when using the generic autoscaler. Communication with Flink jobs can be achieved through the RestClusterClient.
- The generic ScalingRealizer is based on the rescale API (FLIP-291).
- The generic EventHandler is based on the logger.
- The generic StateStore is based on the heap. This means that the state information is stored in memory and can be lost if the autoscaler restarts.

Secondly, for the YARN implementation: as Samrat mentioned, there is currently no flink-yarn-operator, and we cannot easily obtain the job list, so we are not yet sure how to manage YARN's Flink jobs. To prevent the FLIP from becoming too large, and after confirming with Gyula and Samrat, we decided that the current FLIP will not implement the automated yarn-autoscaler; it will be a separate FLIP in the future.

After the first part is finished, Flink users or other Flink platforms can easily use the autoscaler: they just pass the Context, and the autoscaler can find the Flink job using the RestClient.

The first part will be done in this FLIP, and we can discuss whether the second part should be done in this FLIP as well.

Best,
Rui

On Wed, Sep 6, 2023 at 4:34 AM Samrat Deb <decordea...@gmail.com> wrote:
> Hi Max,
>
> > are we planning to add an alternative implementation
> > against the new interfaces?
>
> Yes, we are simultaneously working on the YARN implementation using the interface. During the initial interface design, we encountered some anomalies while implementing it in YARN.
>
> Once the interfaces are finalized, we will proceed to raise a pull request (PR) for YARN as well.
>
> Our initial approach was to create a decoupled interface as part of FLIP-334 and then implement it for YARN in the subsequent phase.
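The generic pieces described at the top of this thread (a user-supplied JobAutoScalerContext, a logger-based EventHandler, and a heap-based StateStore) could look roughly like the following sketch. The type and method names follow the FLIP discussion, but the exact shapes here are assumptions, not the FLIP's final API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the decoupled pieces discussed in FLIP-334.
// Method shapes are assumptions; the real interfaces live in the FLIP.
public class GenericAutoscalerSketch {

    // Per-job context the user passes in; KEY identifies the job.
    static class JobAutoScalerContext<KEY> {
        final KEY jobKey;
        final String restServerAddress; // used to build a REST client for the job
        JobAutoScalerContext(KEY jobKey, String restServerAddress) {
            this.jobKey = jobKey;
            this.restServerAddress = restServerAddress;
        }
    }

    // Generic EventHandler based on the logger: it only records loggable events.
    interface AutoScalerEventHandler<KEY> {
        void handleEvent(JobAutoScalerContext<KEY> ctx, String message);
    }

    // Generic StateStore based on the heap: state is lost if the autoscaler restarts.
    static class HeapedStateStore<KEY> {
        private final Map<KEY, Map<String, String>> states = new HashMap<>();
        Map<String, String> getState(KEY jobKey) {
            return states.computeIfAbsent(jobKey, k -> new HashMap<>());
        }
        void updateState(KEY jobKey, Map<String, String> state) {
            states.put(jobKey, new HashMap<>(state));
        }
    }

    public static void main(String[] args) {
        JobAutoScalerContext<String> ctx =
                new JobAutoScalerContext<>("job-1", "http://jobmanager:8081");
        AutoScalerEventHandler<String> logHandler =
                (c, msg) -> System.out.println(c.jobKey + ": " + msg);
        HeapedStateStore<String> store = new HeapedStateStore<>();

        store.getState(ctx.jobKey).put("scalingHistory", "parallelism=4");
        logHandler.handleEvent(ctx, "stored scaling history");
        System.out.println(store.getState("job-1").get("scalingHistory"));
    }
}
```

A real implementation would hold a RestClusterClient built from the context instead of the plain address string; the heap-backed map illustrates why this store cannot survive restarts.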
> However, if you recommend combining both phases, we can certainly consider that option.
>
> We look forward to hearing your thoughts on whether to have the YARN implementation as part of FLIP-334 or as a separate one.
>
> Bests
> Samrat
>
> On Tue, Sep 5, 2023 at 8:41 PM Maximilian Michels <m...@apache.org> wrote:
> > Thanks Rui for the update!
> >
> > Alongside the refactoring to decouple the autoscaler logic from the deployment logic, are we planning to add an alternative implementation against the new interfaces? I think the best way to get the interfaces right is to have an alternative implementation in addition to Kubernetes. YARN or a standalone mode implementation were already mentioned. Ultimately, this is the reason we are doing the refactoring. Without a new implementation, it becomes harder to justify the refactoring work.
> >
> > Cheers,
> > Max
> >
> > On Tue, Sep 5, 2023 at 9:48 AM Rui Fan <fan...@apache.org> wrote:
> > > After discussing this FLIP-334[1] offline with Gyula and Max, I updated the FLIP based on the latest conclusion.
> > >
> > > Big thanks to Gyula and Max for their professional advice!
> > >
> > > > Does the interface function of handlerRecommendedParallelism in AutoScalerEventHandler conflict with handlerScalingFailure/handlerScalingReport (one of them handles the event of scale failure, and the other handles the event of scale success)?
> > > Hi Matt,
> > >
> > > You can take a look at the FLIP; I think the issue has been fixed. Currently, we introduced the ScalingRealizer and AutoScalerEventHandler interfaces.
> > >
> > > The ScalingRealizer handles scaling actions.
> > >
> > > The AutoScalerEventHandler interface handles loggable events.
> > >
> > > Looking forward to your feedback, thanks!
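The split Rui describes (a ScalingRealizer for scaling actions, an AutoScalerEventHandler for loggable events) might be sketched like this; the signatures are assumptions based on the discussion, not the FLIP's final API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the two roles: the realizer applies parallelism changes to the
// job, while the event handler only records what happened. Signatures are
// assumptions, not the FLIP's final API.
public class RealizerVsEventHandler {

    interface ScalingRealizer<KEY> {
        // Apply the new parallelism to the job identified by jobKey.
        void realize(KEY jobKey, int newParallelism);
    }

    interface AutoScalerEventHandler<KEY> {
        // Record a loggable event; must not change the job itself.
        void handleEvent(KEY jobKey, String event);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ScalingRealizer<String> realizer =
                (jobKey, p) -> log.add("rescale " + jobKey + " to " + p);
        AutoScalerEventHandler<String> events =
                (jobKey, e) -> log.add("event " + jobKey + ": " + e);

        // A scaling decision produces loggable events plus one realizer call.
        events.handleEvent("job-1", "recommended parallelism 8");
        realizer.realize("job-1", 8);
        log.forEach(System.out::println);
    }
}
```

Keeping the two roles separate means a Kubernetes realizer (patching the CR) and a YARN or REST-based realizer can share the same event-handling code.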
> > > [1] https://cwiki.apache.org/confluence/x/x4qzDw
> > >
> > > Best,
> > > Rui
> > >
> > > On Thu, Aug 24, 2023 at 10:55 AM Matt Wang <wang...@163.com> wrote:
> > >> Sorry for the late reply, I still have a small question here: does the interface function of handlerRecommendedParallelism in AutoScalerEventHandler conflict with handlerScalingFailure/handlerScalingReport (one of them handles the event of scale failure, and the other handles the event of scale success)?
> > >>
> > >> --
> > >>
> > >> Best,
> > >> Matt Wang
> > >>
> > >> ---- Replied Message ----
> > >> | From | Rui Fan<1996fan...@gmail.com> |
> > >> | Date | 08/21/2023 17:41 |
> > >> | To | <dev@flink.apache.org> |
> > >> | Cc | Maximilian Michels<m...@apache.org> , Gyula Fóra<gyula.f...@gmail.com> , Matt Wang<wang...@163.com> |
> > >> | Subject | Re: [DISCUSS] FLIP-334 : Decoupling autoscaler and kubernetes |
> > >> Hi Max, Gyula and Matt,
> > >>
> > >> Do you have any other comments?
> > >>
> > >> The flink-kubernetes-operator 1.6 has been released recently, so it's a good time to kick off this FLIP.
> > >>
> > >> Please let me know if you have any questions or concerns. Looking forward to your feedback, thanks!
> > >>
> > >> Best,
> > >> Rui
> > >>
> > >> On Wed, Aug 9, 2023 at 11:55 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >>
> > >> Hi Matt Wang,
> > >>
> > >> Thanks for your discussion here.
> > >>
> > >> it is recommended to unify the descriptions of AutoScalerHandler and AutoScalerEventHandler in the FLIP
> > >>
> > >> Good catch, I have updated all AutoScalerHandler to AutoScalerEventHandler.
> > >>
> > >> Can it support the use of zookeeper (ZooKeeper is commonly used for Flink HA)?
> > >>
> > >> In my opinion, it's a good suggestion.
> > >> However, I prefer we implement other state stores in a separate Flink JIRA, and this FLIP focuses on the decoupling and on implementing the necessary state store. Does that make sense?
> > >>
> > >> Regarding each scaling information, can it be persisted in the shared file system through the filesystem? I think it will be a more valuable requirement to support viewing Autoscaling info on the UI in the future, which can provide some foundations in advance;
> > >>
> > >> This is a good suggestion as well. It's useful for users to check the scaling information. I propose to add a CompositeEventHandler, which can include multiple EventHandlers.
> > >>
> > >> However, as with the last question, I prefer we implement other event handlers in a separate Flink JIRA. What do you think?
> > >>
> > >> A solution mentioned in the FLIP is to initialize the AutoScalerEventHandler object every time an event is processed.
> > >>
> > >> No, the FLIP mentions `The AutoScalerEventHandler object is shared for all flink jobs`, so the AutoScalerEventHandler is only initialized once.
> > >>
> > >> And we call the AutoScalerEventHandler#handlerXXX every time an event is processed.
> > >>
> > >> Best,
> > >> Rui
> > >>
> > >> On Tue, Aug 8, 2023 at 9:40 PM Matt Wang <wang...@163.com> wrote:
> > >>
> > >> Hi Rui
> > >>
> > >> Thanks for driving the FLIP.
> > >>
> > >> I agree with the point of this FLIP. This FLIP first provides a general autoscaler function in the Flink repo, and there is no need to move kubernetes-autoscaler from kubernetes-operator to the Flink repo in this FLIP (it is recommended to unify the descriptions of AutoScalerHandler and AutoScalerEventHandler in the FLIP). Here I still have a few questions:
> > >>
> > >> 1. AutoScalerStateStore mainly records the state information during scaling.
> > >> In addition to supporting the use of ConfigMap, can it support the use of ZooKeeper (ZooKeeper is commonly used for Flink HA)?
> > >> 2. Regarding each scaling information, can it be persisted in the shared file system through the filesystem? I think it will be a more valuable requirement to support viewing Autoscaling info on the UI in the future, which can provide some foundations in advance;
> > >> 3. A solution mentioned in the FLIP is to initialize the AutoScalerEventHandler object every time an event is processed. What is the main purpose of this solution?
> > >>
> > >> --
> > >>
> > >> Best,
> > >> Matt Wang
> > >>
> > >> ---- Replied Message ----
> > >> | From | Rui Fan<1996fan...@gmail.com> |
> > >> | Date | 08/7/2023 11:34 |
> > >> | To | <dev@flink.apache.org> |
> > >> | Cc | m...@apache.org<m...@apache.org> , Gyula Fóra<gyula.f...@gmail.com> |
> > >> | Subject | Re: [DISCUSS] FLIP-334 : Decoupling autoscaler and kubernetes |
> > >> Hi Ron:
> > >>
> > >> Thanks for the feedback! The goal is indeed to turn the autoscaler into a general tool that can support other resource management frameworks.
> > >>
> > >> Hi Max, Gyula:
> > >>
> > >> My proposed `AutoScalerStateStore` is similar to a Map; it can really be improved.
> > >>
> > >> public interface AutoScalerStateStore<KEY> {
> > >>     Map<String, String> getState(KEY jobKey);
> > >>     void updateState(KEY jobKey, Map<String, String> state);
> > >> }
> > >>
> > >> From the method parameters, the StateStore is shared by all jobs, right? If yes, the `KEY jobKey` isn't enough, because the CR is needed when creating the ConfigMap. The `jobKey` is the ResourceID, and the extraJobInfo is the CR.
> > >>
> > >> So, this parameter may need to be changed from `KEY jobKey` to `JobAutoScalerContext<KEY, INFO>`. Does that make sense? If yes, I can update the interface in the FLIP doc.
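Rui's proposed change, passing the whole context instead of just the key, can be illustrated with a minimal sketch. The `JobAutoScalerContext<KEY, INFO>` name and the extraJobInfo field follow the discussion above; the bodies are assumed for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the revised signature: the state store receives the whole
// JobAutoScalerContext so that an implementation (e.g. a ConfigMap-backed
// store) can reach the extra job info (the CR) rather than only the job key.
// Names follow the mailing-list discussion; bodies are illustrative only.
public class ContextKeyedStateStore {

    static class JobAutoScalerContext<KEY, INFO> {
        final KEY jobKey;        // e.g. a ResourceID
        final INFO extraJobInfo; // e.g. the Kubernetes CR
        JobAutoScalerContext(KEY jobKey, INFO extraJobInfo) {
            this.jobKey = jobKey;
            this.extraJobInfo = extraJobInfo;
        }
    }

    interface AutoScalerStateStore<KEY, INFO> {
        Map<String, String> getState(JobAutoScalerContext<KEY, INFO> ctx);
        void updateState(JobAutoScalerContext<KEY, INFO> ctx, Map<String, String> state);
    }

    // A heap-backed implementation only needs the key; a ConfigMap-backed
    // one would additionally read ctx.extraJobInfo.
    static class HeapStateStore<KEY, INFO> implements AutoScalerStateStore<KEY, INFO> {
        private final Map<KEY, Map<String, String>> byJob = new HashMap<>();
        public Map<String, String> getState(JobAutoScalerContext<KEY, INFO> ctx) {
            return byJob.getOrDefault(ctx.jobKey, new HashMap<>());
        }
        public void updateState(JobAutoScalerContext<KEY, INFO> ctx, Map<String, String> state) {
            byJob.put(ctx.jobKey, new HashMap<>(state));
        }
    }

    public static void main(String[] args) {
        JobAutoScalerContext<String, String> ctx =
                new JobAutoScalerContext<>("resource-1", "cr-placeholder");
        HeapStateStore<String, String> store = new HeapStateStore<>();
        Map<String, String> state = new HashMap<>();
        state.put("scalingHistory", "p=4");
        store.updateState(ctx, state);
        System.out.println(store.getState(ctx).get("scalingHistory"));
    }
}
```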
> > >>
> > >> Best,
> > >> Rui
> > >>
> > >> On Mon, Aug 7, 2023 at 10:18 AM liu ron <ron9....@gmail.com> wrote:
> > >>
> > >> Hi, Rui
> > >>
> > >> Thanks for driving the FLIP.
> > >>
> > >> The tuning of streaming jobs by the autoscaler is very important. Although the mainstream trend now is cloud-native, many companies still run their Flink jobs on YARN for historical reasons. If we can decouple the autoscaler from K8s and turn it into a common tool that can support other resource management frameworks such as YARN, I think it will be very meaningful.
> > >> +1 for this proposal.
> > >>
> > >> Best,
> > >> Ron
> > >>
> > >> Gyula Fóra <gyula.f...@gmail.com> wrote on Sat, Aug 5, 2023 at 15:03:
> > >>
> > >> Hi Rui!
> > >>
> > >> Thanks for the proposal.
> > >>
> > >> I agree with Max that the state store abstractions could be improved and made more specific, as we know what goes into the state. It could simply be
> > >>
> > >> public interface AutoScalerStateStore<KEY> {
> > >>     Map<String, String> getState(KEY jobKey);
> > >>     void updateState(KEY jobKey, Map<String, String> state);
> > >> }
> > >>
> > >> We could also remove the entire recommended parallelism logic from the interface and make it internal to the implementation somehow, because it's not very nice in the current form.
> > >>
> > >> Cheers,
> > >> Gyula
> > >>
> > >> On Fri, Aug 4, 2023 at 7:05 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >>
> > >> Hi Max,
> > >>
> > >> After careful consideration, I prefer to keep the AutoScalerStateStore instead of having the AutoScalerEventHandler take over the work of the AutoScalerStateStore. The following are some reasons:
> > >>
> > >> 1. Keeping the AutoScalerStateStore makes the StateStore easy to plug in.
> > >>
> > >> Currently, the kubernetes-operator-autoscaler uses the ConfigMap as the state store.
> > >> However, users may use a different state store for the yarn-autoscaler or the generic autoscaler, such as a MySQL StateStore, a heap-based StateStore or a PostgreSQL StateStore, etc.
> > >>
> > >> Of course, the kubernetes autoscaler can also use the MySQL StateStore. If the AutoScalerEventHandler is responsible for recording events, scaling jobs and accessing state, then whenever users or the community want to create a new state store, they must also implement a new AutoScalerEventHandler, including recording events and scaling jobs.
> > >>
> > >> If we decouple the AutoScalerEventHandler and the AutoScalerStateStore, it's easy to implement a new state store.
> > >>
> > >> 2. The AutoScalerEventHandler isn't suitable for accessing state.
> > >>
> > >> Sometimes the generic autoscaler needs to query state. The AutoScalerEventHandler can update the state while handling events; however, it's weird to provide a series of interfaces to query state.
> > >>
> > >> What do you think?
> > >>
> > >> And looking forward to more thoughts from the community, thanks!
> > >>
> > >> Best,
> > >> Rui Fan
> > >>
> > >> On Tue, Aug 1, 2023 at 11:47 PM Rui Fan <1996fan...@gmail.com> wrote:
> > >>
> > >> Hi Max,
> > >>
> > >> Thanks for your quick response!
> > >>
> > >> 1. Handle state in the AutoScalerEventHandler which will receive all related scaling and metric collection events, and can keep track of any state.
> > >>
> > >> If I understand correctly, you mean that updating state is just part of handling events, right?
> > >>
> > >> If yes, that makes sense. However, I have some concerns:
> > >>
> > >> - Currently, we have 3 key-values that need to be stored. The autoscaler needs to get them first, then update them, and sometimes remove them. If we use the AutoScalerEventHandler, we need to provide 9 methods, right? Every key has 3 methods.
> > >> - Do we add a persistState interface to the AutoScalerEventHandler to persist in-memory state to Kubernetes?
> > >>
> > >> 2. In the long run, the autoscaling logic can move to a separate repository, although this will complicate the release process, so I would defer this unless there is strong interest.
> > >>
> > >> I also agree to leave it in flink-k8s-operator for now, unless moving it out of flink-k8s-operator becomes necessary in the future.
> > >>
> > >> 3. The proposal mentions some removal of tests.
> > >>
> > >> Sorry, I didn't express this clearly in the FLIP. The POC just checks whether these interfaces work well, and it would have taken more time to develop all the tests during the POC, so I removed these tests in my POC.
> > >>
> > >> These tests will be completed in the final PR, and the tests are very useful for reducing bugs.
> > >>
> > >> Best,
> > >> Rui Fan
> > >>
> > >> On Tue, Aug 1, 2023 at 10:10 PM Maximilian Michels <m...@apache.org> wrote:
> > >>
> > >> Hi Rui,
> > >>
> > >> Thanks for the proposal. I think it makes a lot of sense to decouple the autoscaler from Kubernetes-related dependencies. A couple of notes from when I read the proposal:
> > >>
> > >> 1. You propose AutoScalerEventHandler, AutoScalerStateStore, AutoScalerStateStoreFactory, and AutoScalerEventHandler. AutoscalerStateStore is a generic key/value database (methods: "get"/"put"/"delete"). I would propose to refine this interface and make it less general purpose, e.g. add a method for persisting scaling decisions as well as any metrics gathered for the current metric window. For simplicity, I'd even go so far as to remove the state store entirely, but rather handle state in the AutoScalerEventHandler which will receive all related scaling and metric collection events, and can keep track of any state.
> > >>
> > >> 2.
> > >> You propose to make the current autoscaler module Kubernetes-agnostic by moving the Kubernetes parts into the main operator module. I think that makes sense since the Kubernetes implementation will continue to be tightly coupled with Kubernetes. The goal of the separate module was to make the autoscaler logic pluggable, but this will continue to be possible with the new "flink-autoscaler" module which contains the autoscaling logic and interfaces. In the long run, the autoscaling logic can move to a separate repository, although this will complicate the release process, so I would defer this unless there is strong interest.
> > >>
> > >> 3. The proposal mentions some removal of tests. It is critical for us that all test coverage of the current implementation remains active. It is ok if some of the test coverage only covers the Kubernetes implementation. We can eventually move more tests without Kubernetes significance into the implementation-agnostic autoscaler tests.
> > >>
> > >> -Max
> > >>
> > >> On Tue, Aug 1, 2023 at 9:46 AM Rui Fan <fan...@apache.org> wrote:
> > >>
> > >> Hi all,
> > >>
> > >> Samrat (cc'ed) and I created FLIP-334[1] to decouple the autoscaler and Kubernetes.
> > >>
> > >> Currently, the flink-autoscaler is tightly integrated with Kubernetes. There are compelling reasons to extend the use of the flink-autoscaler to more types of Flink jobs:
> > >> 1. With the recent merge of Externalized Declarative Resource Management (FLIP-291[2]), in-place scaling is now supported across all types of Flink jobs. This development has made scaling Flink on YARN a straightforward process.
> > >> 2.
> > >> Several discussions[3] within the Flink user community, as observed on the mailing list, have emphasized the necessity of the flink-autoscaler supporting Flink on YARN.
> > >>
> > >> Please refer to the FLIP[1] document for more details about the proposed design and implementation. We welcome any feedback and opinions on this proposal.
> > >>
> > >> [1] https://cwiki.apache.org/confluence/x/x4qzDw
> > >> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management
> > >> [3] https://lists.apache.org/thread/pr0r8hq8kqpzk3q1zrzkl3rp1lz24v7v
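Rui's CompositeEventHandler idea from earlier in the thread (one handler fanning events out to several delegates, e.g. a logger-based one plus a filesystem-persisting one) could be sketched as follows; the interface shape is an assumption, not the FLIP's final API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the CompositeEventHandler proposed in the thread: it includes
// multiple EventHandlers and forwards every event to each of them.
// The interface shape is an assumption, not the FLIP's final API.
public class CompositeEventHandlerSketch {

    interface AutoScalerEventHandler<KEY> {
        void handleEvent(KEY jobKey, String event);
    }

    static class CompositeEventHandler<KEY> implements AutoScalerEventHandler<KEY> {
        private final List<AutoScalerEventHandler<KEY>> delegates;

        @SafeVarargs
        CompositeEventHandler(AutoScalerEventHandler<KEY>... handlers) {
            this.delegates = Arrays.asList(handlers);
        }

        // Fan the event out to every delegate handler.
        public void handleEvent(KEY jobKey, String event) {
            for (AutoScalerEventHandler<KEY> h : delegates) {
                h.handleEvent(jobKey, event);
            }
        }
    }

    public static void main(String[] args) {
        List<String> logged = new ArrayList<>();
        List<String> persisted = new ArrayList<>();
        CompositeEventHandler<String> composite = new CompositeEventHandler<String>(
                (k, e) -> logged.add(k + ": " + e),    // stands in for a logger handler
                (k, e) -> persisted.add(k + "|" + e)); // stands in for a filesystem handler
        composite.handleEvent("job-1", "scaled to parallelism 8");
        System.out.println(logged.get(0));
        System.out.println(persisted.get(0));
    }
}
```

This keeps each concrete handler single-purpose while letting deployments combine logging with, say, persisting scaling info for a future UI.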