@JunRui: There are two pieces that prevent scaling on minor load variations. Firstly, the algorithm/logic is intended to work on metrics averaged over a configured time window (let's say the last 5 minutes); this smooths out minor variances and results in more stability. Secondly, in addition to the utilization target (let's say 75%) there would be a configurable flexible bound (for example +/- 10%), and as long as we stay within the bound we wouldn't trigger scaling. We believe these two together should be enough.
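To make this concrete, here is a rough sketch (in Java, using placeholder names and numbers rather than the actual operator code) of the kind of decision logic we have in mind:

import java.util.List;

/** Illustrative sketch only; not the actual operator code. */
public class ScalingTriggerSketch {

    // Example values matching the numbers above; both would be configurable.
    private static final double TARGET_UTILIZATION = 0.75; // 75% target
    private static final double UTILIZATION_BOUND = 0.10;  // +/- 10% flexible bound

    /** Average the utilization samples collected over the configured window (e.g. last 5 minutes). */
    static double windowAverage(List<Double> utilizationSamples) {
        return utilizationSamples.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                // With no samples yet, default to the target, i.e. do not scale.
                .orElse(TARGET_UTILIZATION);
    }

    /** Only trigger scaling when the windowed average leaves [target - bound, target + bound]. */
    static boolean shouldScale(List<Double> utilizationSamples) {
        double avg = windowAverage(utilizationSamples);
        return avg > TARGET_UTILIZATION + UTILIZATION_BOUND
                || avg < TARGET_UTILIZATION - UTILIZATION_BOUND;
    }

    public static void main(String[] args) {
        // A brief spike that averages out inside the bound does not trigger scaling...
        System.out.println(shouldScale(List.of(0.74, 0.76, 0.78, 0.80))); // false (avg ~0.77)
        // ...while a sustained deviation over the window does.
        System.out.println(shouldScale(List.of(0.88, 0.90, 0.92, 0.95))); // true (avg ~0.91)
    }
}

The point is simply that a short spike within the bound never triggers a rescale, while a sustained deviation over the averaging window does.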
@Pedro: Periodic metric collection is very important to get a good overview of the system at any given time. We also need some recent history to make a good scaling decision. Reflecting on your suggestion: by the time the input lag increases, we are already too late to scale. Ideally the algorithm would pick up on a gradual load increase before it even results in actual input lag, so we can always keep our target utilization level.

Cheers,
Gyula

On Sat, Nov 5, 2022 at 5:16 PM Pedro Silva <pedro.cl...@gmail.com> wrote:

> Hello,
>
> First of all, thank you for tackling this theme, it is a massive boon to
> Flink if it gets in.
>
> Following up on JunRui Lee’s question:
> Have you considered triggering metrics collection based on events rather
> than periodic checks?
>
> I.e. if input source lag has been increasing for the past x amount of time ->
> trigger a metrics collection to understand what to scale, if anything.
>
> For Kubernetes workloads there is KEDA, which does this:
> https://keda.sh/docs/2.8/scalers/prometheus/
>
> My apologies if the question doesn’t make sense.
>
> Thank you for your time,
> Pedro Silva
>
>
> > On 5 Nov 2022, at 08:09, JunRui Lee <jrlee....@gmail.com> wrote:
> >
> > Hi Max,
> >
> > Thanks for writing this FLIP and initiating the discussion.
> >
> > I just have a small question after reading the FLIP:
> >
> > In the document, I didn't find a definition of when to trigger
> > autoscaling once some JobVertex reaches the threshold. If I missed it,
> > please let me know.
> > IIUC, proper triggering rules are necessary to avoid unnecessary
> > autoscaling caused by temporary large changes in the data; otherwise it
> > can lead to at least two meaningless resubmissions of the job, which
> > will negatively affect users.
> >
> > Thanks,
> > JunRui Lee
> >
> > On Sat, Nov 5, 2022 at 20:38, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> >> Hey!
> >>
> >> Thanks for the input!
> >>
> >> The algorithm does not really differentiate between scaling up and down,
> >> as it is concerned with finding the right parallelism to match the target
> >> processing rate with just enough spare capacity.
> >>
> >> Let me try to address your specific points:
> >>
> >> 1. The backlog growth rate only matters for computing the target
> >> processing rate for the sources. If the parallelism is high enough and
> >> there is no back pressure, it will be close to 0, so the target rate is
> >> the source read rate. This is as intended. If we see that the sources are
> >> not busy and can read more than enough, the algorithm would scale them
> >> down.
> >>
> >> 2. You are right, it's dangerous to scale in too much, so we already
> >> thought about limiting the scale-down amount per scaling step/time window
> >> to give more safety. But we can definitely think about different
> >> strategies in the future!
> >>
> >> The observation regarding max parallelism is very important and we should
> >> always take it into consideration.
> >>
> >> Cheers,
> >> Gyula
> >>
> >>> On Sat, 5 Nov 2022 at 11:46, Biao Geng <biaoge...@gmail.com> wrote:
> >>>
> >>> Hi Max,
> >>>
> >>> Thanks a lot for the FLIP. It is an extremely attractive feature!
> >>> Just some follow-up questions/thoughts after reading the FLIP:
> >>>
> >>> In the doc, the discussion of the strategy for “scaling out” is thorough
> >>> and convincing to me, but it seems that “scaling down” is less discussed.
> >>> I have 2 cents on this aspect:
> >>>
> >>> 1. For source parallelisms, if the user configures a much larger value
> >>> than necessary, there should be very few pending records, even though
> >>> there is room for optimization. But IIUC, with the current algorithm we
> >>> will not take action in this case, as the backlog growth rate is almost
> >>> zero. Is this understanding right?
> >>>
> >>> 2. Compared with “scaling out”, “scaling in” is usually more dangerous,
> >>> as it is more likely to have a negative influence on downstream jobs.
> >>> The min/max load bounds should be useful. I am wondering if it is
> >>> possible to have a different strategy for “scaling in” to make it more
> >>> conservative, or, going further, to allow a custom autoscaling strategy
> >>> (e.g. a time-based strategy).
> >>>
> >>> Another side thought: to recover a job from a checkpoint/savepoint, the
> >>> new parallelism cannot be larger than the max parallelism defined in the
> >>> checkpoint (see
> >>> https://github.com/apache/flink/blob/17a782c202c93343b8884cb52f4562f9c4ba593f/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L128
> >>> ). Not sure if this limit should be mentioned in the FLIP.
> >>>
> >>> Again, thanks for the great work, and looking forward to using the Flink
> >>> k8s operator with it!
> >>>
> >>> Best,
> >>> Biao Geng
> >>>
> >>> From: Maximilian Michels <m...@apache.org>
> >>> Date: Saturday, November 5, 2022 at 2:37 AM
> >>> To: dev <dev@flink.apache.org>
> >>> Cc: Gyula Fóra <gyula.f...@gmail.com>, Thomas Weise <t...@apache.org>,
> >>> Marton Balassi <mbala...@apache.org>, Őrhidi Mátyás <matyas.orh...@gmail.com>
> >>> Subject: [DISCUSS] FLIP-271: Autoscaling
> >>>
> >>> Hi,
> >>>
> >>> I would like to kick off the discussion on implementing autoscaling for
> >>> Flink as part of the Flink Kubernetes operator. I've outlined an approach
> >>> here which I find promising:
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-271%3A+Autoscaling
> >>>
> >>> I've been discussing this approach with some of the operator contributors:
> >>> Gyula, Marton, Matyas, and Thomas (all in CC). We started prototyping an
> >>> implementation based on the current FLIP design. If that goes well, we
> >>> would like to contribute this to Flink based on the results of the
> >>> discussion here.
> >>>
> >>> I'm curious to hear your thoughts.
> >>>
> >>> -Max
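PS, on the scale-down discussion quoted above: a similarly rough sketch of how the pieces could fit together (matching the target processing rate, limiting scale-down per step, and respecting the checkpoint's max parallelism). Again, placeholder names and numbers only, not the prototype implementation:

/** Illustrative sketch only; names, parameters and the scale-down limit are assumptions, not the FLIP-271 algorithm. */
public class ParallelismSketch {

    /**
     * @param currentParallelism current parallelism of the job vertex
     * @param processingRate     observed records/sec at the current parallelism
     * @param targetRate         records/sec we need to sustain (read rate plus backlog catch-up)
     * @param maxParallelism     max parallelism stored in the checkpoint; we may not exceed it
     * @param maxScaleDownFactor e.g. 0.5 means never remove more than half the subtasks per step
     */
    static int suggestParallelism(int currentParallelism,
                                  double processingRate,
                                  double targetRate,
                                  int maxParallelism,
                                  double maxScaleDownFactor) {
        // Per-subtask throughput observed over the metric window.
        double perTaskRate = processingRate / currentParallelism;
        int suggested = (int) Math.ceil(targetRate / perTaskRate);

        // Be conservative when scaling in: limit the reduction per scaling step.
        int scaleDownFloor = (int) Math.ceil(currentParallelism * maxScaleDownFactor);
        suggested = Math.max(suggested, scaleDownFloor);

        // Never exceed the max parallelism defined in the checkpoint, never go below 1.
        return Math.max(1, Math.min(suggested, maxParallelism));
    }

    public static void main(String[] args) {
        // 10 subtasks handling 1000 rec/s while only 300 rec/s is needed:
        // unbounded this would drop to 3, the 50% scale-down limit keeps it at 5.
        System.out.println(suggestParallelism(10, 1000, 300, 128, 0.5)); // 5
    }
}

The exact formula in the FLIP will of course differ; this is only meant to illustrate the shape of the decision.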