@JunRui: There are two pieces that prevent scaling on minor load variations. Firstly, the algorithm/logic is intended to work on metrics averaged over a configured time window (let's say the last 5 minutes); this smooths out minor variances and results in more stability. Secondly, in addition to the utilization target (let's say 75%) there would be a configurable flexible bound (for example +/- 10%), and as long as we stay within the bound we wouldn't trigger scaling. We believe these two together should be enough.
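To make this concrete, here is a rough sketch (in Java, using placeholder names and numbers rather than the actual operator code) of the kind of decision logic we have in mind:

import java.util.List;

/** Illustrative sketch only; not the actual operator code. */
public class ScalingTriggerSketch {

    // Example values matching the numbers above; both would be configurable.
    private static final double TARGET_UTILIZATION = 0.75; // 75% target
    private static final double UTILIZATION_BOUND = 0.10;  // +/- 10% flexible bound

    /** Average the utilization samples collected over the configured window (e.g. last 5 minutes). */
    static double windowAverage(List<Double> utilizationSamples) {
        return utilizationSamples.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                // With no samples yet, default to the target, i.e. do not scale.
                .orElse(TARGET_UTILIZATION);
    }

    /** Only trigger scaling when the windowed average leaves [target - bound, target + bound]. */
    static boolean shouldScale(List<Double> utilizationSamples) {
        double avg = windowAverage(utilizationSamples);
        return avg > TARGET_UTILIZATION + UTILIZATION_BOUND
                || avg < TARGET_UTILIZATION - UTILIZATION_BOUND;
    }

    public static void main(String[] args) {
        // A brief spike that averages out inside the bound does not trigger scaling...
        System.out.println(shouldScale(List.of(0.74, 0.76, 0.78, 0.80))); // false (avg ~0.77)
        // ...while a sustained deviation over the window does.
        System.out.println(shouldScale(List.of(0.88, 0.90, 0.92, 0.95))); // true (avg ~0.91)
    }
}

The point is simply that a short spike within the bound never triggers a rescale, while a sustained deviation over the averaging window does.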
@Pedro: Periodic metric collection is very important to get a good overview of the system at any given time. We also need some recent history to make a good scaling decision. Reflecting on your suggestion: by the time the input lag increases, we are already too late to scale. Ideally the algorithm would pick up on a gradual load increase before it even results in actual input lag, so we can always keep our target utilization level.

Cheers,
Gyula

On Sat, Nov 5, 2022 at 5:16 PM Pedro Silva <pedro.cl...@gmail.com> wrote:

> Hello,
>
> First of all, thank you for tackling this theme, it is a massive boon to
> Flink if it gets in.
>
> Following up on JunRui Lee’s question:
> Have you considered triggering metrics collection based on events rather
> than periodic checks?
>
> I.e. if input source lag has been increasing for the past x amount of time ->
> trigger a metrics collection to understand what to scale, if anything.
>
> For Kubernetes workloads there is KEDA, which does this:
> https://keda.sh/docs/2.8/scalers/prometheus/
>
> My apologies if the question doesn’t make sense.
>
> Thank you for your time,
> Pedro Silva
>
>
> > On 5 Nov 2022, at 08:09, JunRui Lee <jrlee....@gmail.com> wrote:
> >
> > Hi Max,
> >
> > Thanks for writing this FLIP and initiating the discussion.
> >
> > I just have a small question after reading the FLIP:
> >
> > In the document, I didn't find a definition of when to trigger
> > autoscaling once some JobVertex reaches the threshold. If I missed it,
> > please let me know.
> > IIUC, proper triggering rules are necessary to avoid unnecessary
> > autoscaling caused by temporary large changes in the data; otherwise it
> > can lead to at least two meaningless resubmissions of the job, which
> > will negatively affect users.
> >
> > Thanks,
> > JunRui Lee
> >
> > On Sat, Nov 5, 2022 at 20:38, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> >> Hey!
> >>
> >> Thanks for the input!
> >>
> >> The algorithm does not really differentiate between scaling up and down,
> >> as it is concerned with finding the right parallelism to match the target
> >> processing rate with just enough spare capacity.
> >>
> >> Let me try to address your specific points:
> >>
> >> 1. The backlog growth rate only matters for computing the target
> >> processing rate for the sources. If the parallelism is high enough and
> >> there is no back pressure, it will be close to 0, so the target rate is
> >> the source read rate. This is as intended. If we see that the sources are
> >> not busy and can read more than enough, the algorithm would scale them
> >> down.
> >>
> >> 2. You are right, it's dangerous to scale in too much, so we already
> >> thought about limiting the scale-down amount per scaling step/time window
> >> to give more safety. But we can definitely think about different
> >> strategies in the future!
> >>
> >> The observation regarding max parallelism is very important and we should
> >> always take it into consideration.
> >>
> >> Cheers,
> >> Gyula
> >>
> >>> On Sat, 5 Nov 2022 at 11:46, Biao Geng <biaoge...@gmail.com> wrote:
> >>>
> >>> Hi Max,
> >>>
> >>> Thanks a lot for the FLIP. It is an extremely attractive feature!
> >>> Just some follow-up questions/thoughts after reading the FLIP:
> >>>
> >>> In the doc, the discussion of the strategy for “scaling out” is thorough
> >>> and convincing to me, but it seems that “scaling down” is less discussed.
> >>> I have 2 cents on this aspect:
> >>>
> >>> 1. For source parallelisms, if the user configures a much larger value
> >>> than necessary, there should be very few pending records, even though
> >>> there is room for optimization. But IIUC, with the current algorithm we
> >>> will not take action in this case, as the backlog growth rate is almost
> >>> zero. Is this understanding right?
> >>>
> >>> 2. Compared with “scaling out”, “scaling in” is usually more dangerous,
> >>> as it is more likely to have a negative influence on downstream jobs.
> >>> The min/max load bounds should be useful. I am wondering if it is
> >>> possible to have a different strategy for “scaling in” to make it more
> >>> conservative, or, going further, to allow a custom autoscaling strategy
> >>> (e.g. a time-based strategy).
> >>>
> >>> Another side thought: to recover a job from a checkpoint/savepoint, the
> >>> new parallelism cannot be larger than the max parallelism defined in the
> >>> checkpoint (see
> >>> https://github.com/apache/flink/blob/17a782c202c93343b8884cb52f4562f9c4ba593f/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L128
> >>> ). Not sure if this limit should be mentioned in the FLIP.
> >>>
> >>> Again, thanks for the great work, and looking forward to using the Flink
> >>> k8s operator with it!
> >>>
> >>> Best,
> >>> Biao Geng
> >>>
> >>> From: Maximilian Michels <m...@apache.org>
> >>> Date: Saturday, November 5, 2022 at 2:37 AM
> >>> To: dev <dev@flink.apache.org>
> >>> Cc: Gyula Fóra <gyula.f...@gmail.com>, Thomas Weise <t...@apache.org>,
> >>> Marton Balassi <mbala...@apache.org>, Őrhidi Mátyás <matyas.orh...@gmail.com>
> >>> Subject: [DISCUSS] FLIP-271: Autoscaling
> >>>
> >>> Hi,
> >>>
> >>> I would like to kick off the discussion on implementing autoscaling for
> >>> Flink as part of the Flink Kubernetes operator. I've outlined an approach
> >>> here which I find promising:
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-271%3A+Autoscaling
> >>>
> >>> I've been discussing this approach with some of the operator contributors:
> >>> Gyula, Marton, Matyas, and Thomas (all in CC). We started prototyping an
> >>> implementation based on the current FLIP design. If that goes well, we
> >>> would like to contribute this to Flink based on the results of the
> >>> discussion here.
> >>>
> >>> I'm curious to hear your thoughts.
> >>>
> >>> -Max
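PS, on the scale-down discussion quoted above: a similarly rough sketch of how the pieces could fit together (matching the target processing rate, limiting scale-down per step, and respecting the checkpoint's max parallelism). Again, placeholder names and numbers only, not the prototype implementation:

/** Illustrative sketch only; names, parameters and the scale-down limit are assumptions, not the FLIP-271 algorithm. */
public class ParallelismSketch {

    /**
     * @param currentParallelism current parallelism of the job vertex
     * @param processingRate     observed records/sec at the current parallelism
     * @param targetRate         records/sec we need to sustain (read rate plus backlog catch-up)
     * @param maxParallelism     max parallelism stored in the checkpoint; we may not exceed it
     * @param maxScaleDownFactor e.g. 0.5 means never remove more than half the subtasks per step
     */
    static int suggestParallelism(int currentParallelism,
                                  double processingRate,
                                  double targetRate,
                                  int maxParallelism,
                                  double maxScaleDownFactor) {
        // Per-subtask throughput observed over the metric window.
        double perTaskRate = processingRate / currentParallelism;
        int suggested = (int) Math.ceil(targetRate / perTaskRate);

        // Be conservative when scaling in: limit the reduction per scaling step.
        int scaleDownFloor = (int) Math.ceil(currentParallelism * maxScaleDownFactor);
        suggested = Math.max(suggested, scaleDownFloor);

        // Never exceed the max parallelism defined in the checkpoint, never go below 1.
        return Math.max(1, Math.min(suggested, maxParallelism));
    }

    public static void main(String[] args) {
        // 10 subtasks handling 1000 rec/s while only 300 rec/s is needed:
        // unbounded this would drop to 3, the 50% scale-down limit keeps it at 5.
        System.out.println(suggestParallelism(10, 1000, 300, 128, 0.5)); // 5
    }
}

The exact formula in the FLIP will of course differ; this is only meant to illustrate the shape of the decision.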