Thanks Maximilian and Gyula, I'll keep you updated.

Best,
Yang
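P.S. For reference, this is roughly the kind of heap safety-margin check I'd like to experiment with before a scale-down is applied. It is only a sketch under a naive assumption (heap usage redistributes evenly across the remaining parallel instances); the class, parameter, and config names below are placeholders I made up, not part of the actual autoscaler code:

```java
// Sketch only: placeholder names, not an existing flink-kubernetes-operator class.
public final class HeapSafetyMarginCheck {

    // Hypothetical knob: max fraction of the heap we tolerate after a scale-down.
    private final double maxProjectedHeapUtilization;

    public HeapSafetyMarginCheck(double maxProjectedHeapUtilization) {
        this.maxProjectedHeapUtilization = maxProjectedHeapUtilization;
    }

    /**
     * Decides whether a proposed scale-down looks safe from a heap perspective,
     * assuming (naively) that current heap usage redistributes evenly over the
     * remaining parallel instances.
     *
     * @param currentParallelism  parallelism the job currently runs with
     * @param proposedParallelism parallelism recommended by the autoscaler (>= 1)
     * @param usedHeapBytesTotal  aggregate used heap across all TaskManagers
     * @param maxHeapBytesPerSlot max heap available to a single parallel instance
     */
    public boolean isScaleDownSafe(
            int currentParallelism,
            int proposedParallelism,
            long usedHeapBytesTotal,
            long maxHeapBytesPerSlot) {

        if (proposedParallelism >= currentParallelism) {
            return true; // only guard scale-downs
        }
        double projectedPerSlot = (double) usedHeapBytesTotal / proposedParallelism;
        double projectedUtilization = projectedPerSlot / maxHeapBytesPerSlot;
        return projectedUtilization <= maxProjectedHeapUtilization;
    }
}
```

With something like this wired in before the scaling decision is applied, a downscale would be rejected whenever the projected per-slot heap usage exceeds the configured margin (say 80% of max heap). The real numbers would have to come from the metrics the operator already collects, or from the upcoming heap/GC metrics Gyula mentioned.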
On Sat, 11 Nov 2023 at 16:18, Maximilian Michels <m...@apache.org> wrote:
> Hi Yang,
>
> We're always open to changes / additions to the autoscaler logic and metric collection. Preferably, we change these directly in the autoscaler implementation, without adding additional processes or controllers. Let us know how your experiments go! If you want to contribute, a JIRA with a description of the changes would be the first step. We can take it from there.
>
> Cheers,
> Max
>
> On Tue, Nov 7, 2023 at 9:04 PM Yang LI <yang.hunter...@gmail.com> wrote:
> >
> > Hi Gyula,
> >
> > Thank you for the feedback! With your permission, I plan to integrate the implementation into the flink-kubernetes-operator-autoscaler module to test it in my environment. Subsequently, I may contribute these changes back to the community by submitting a pull request to the GitHub repository in the coming months.
> >
> > Best,
> > Yang
> >
> > On Tue, 7 Nov 2023 at 19:08, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>
> >> Sounds like a lot of work for very little gain to me. If you really feel that there is some room for improvement with the current implementation, it may be simpler to fix that.
> >>
> >> Gyula
> >>
> >> On Tue, 7 Nov 2023 at 01:20, Yang LI <yang.hunter...@gmail.com> wrote:
> >>>
> >>> Thanks for the information!
> >>>
> >>> I haven't tested Kubernetes' built-in rollback mechanism yet. I feel like I could create another independent operator which monitors the Flink application's JVM memory and triggers a rollback.
> >>>
> >>> Another solution I would like to discuss is to implement an independent operator. This operator would do the following:
> >>>
> >>> 1. Retrieve the state size metrics for Flink applications from Prometheus.
> >>> 2. Gather current and recommended parallelism metrics from the Flink operator, also reported in Prometheus.
> >>> 3. If a downscale is advised, calculate whether the new cluster configuration, considering state size and JVM max heap size, can support the entire state; if not, halt the operation.
> >>> 4. If feasible, manage the rescaling process similarly to the Flink operator, either by making API requests or by applying a kubectl patch to the FlinkDeployment CRD.
> >>>
> >>> By doing this we could achieve something similar to what we could do with a plugin system. Of course, in this case I'd disable scaling in the Flink operator. Do you think it could work?
> >>>
> >>> Best,
> >>> Yang
> >>>
> >>> On Mon, 6 Nov 2023 at 23:43, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>>>
> >>>> Hey!
> >>>>
> >>>> Bit of a tricky problem, as it's not really possible to know that the job will be able to start with lower parallelism in some cases. Custom plugins may work but that would be an extremely complex solution at this point.
> >>>>
> >>>> The Kubernetes operator has a built-in rollback mechanism that can help with rolling back these broken scale operations, have you tried that? Furthermore, we are planning to introduce some heap/GC related metrics soon (probably after the next release, for 1.8.0) that may help us catch these issues.
> >>>>
> >>>> Cheers,
> >>>> Gyula
> >>>>
> >>>> On Mon, Nov 6, 2023 at 9:27 AM Yang LI <yang.hunter...@gmail.com> wrote:
> >>>>>
> >>>>> Dear Flink Community,
> >>>>>
> >>>>> I am currently working on implementing auto-scaling for my Flink application using the Flink operator's autoscaler.
> >>>>> During testing, I encountered a "java.lang.OutOfMemoryError: Java heap space" exception when the autoscaler attempted to scale down. This issue arises when the incoming record rate decreases while the state size has not yet reduced correspondingly. Despite numerous tests, managing this issue has been difficult due to the lack of a parameter that allows specifying a cooldown period (essential for processing and reducing state size) prior to actually scaling down. Moreover, determining an optimal duration for this cooldown period is also not straightforward. I believe that enhancing the autoscaler with a focus on memory checks, or more broadly on stability conditions, could significantly address this issue. Here are some potential solutions that, in my opinion, could improve the situation:
> >>>>>
> >>>>> 1. Integrate heap memory-related metrics into the metric collection, coupled with a memory safety margin check within the autoscaler's algorithm.
> >>>>>
> >>>>> 2. Introduce a plugin system and a pre-rescaling step in the Flink operator's autoscaler, which would allow users to implement custom plugins. These plugins could host listeners that activate during the pre-hook step, adding an additional check before the algorithm executes, so scale-downs stay blocked until custom checks pass and it is safe to proceed.
> >>>>>
> >>>>> 3. Implement a parameter that establishes a stability threshold for heap memory usage percentage or JVM old-generation GC (duration or count). If the threshold is exceeded, the system would revert to the last stable scale in the scaling history. The stabilization interval would then kick in, giving the Flink cluster additional time to process and reduce the state size.
> >>>>>
> >>>>> Let me know what you think about it! Thanks!
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Yang LI
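To make the plugin idea (point 2 of my original mail above) a bit more concrete, below is the kind of pre-rescale hook I was imagining. Nothing like this exists in the operator today; the interface name and signature are purely illustrative:

```java
// Purely illustrative: a hypothetical hook the autoscaler could call before
// applying a scale-down; not an existing flink-kubernetes-operator API.
public interface PreScaleDownCheck {

    /**
     * Called before a recommended scale-down is applied.
     *
     * @param currentParallelism parallelism the job currently runs with
     * @param targetParallelism  parallelism the autoscaler wants to apply
     * @return true to allow the scale-down, false to keep blocking it
     */
    boolean allowScaleDown(int currentParallelism, int targetParallelism);
}
```

A user plugin implementing this could, for instance, combine the state size reported in Prometheus with the configured max heap and keep vetoing the downscale until the state has shrunk enough, which is essentially the same feasibility check as in the independent-operator idea discussed earlier in the thread.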