Re: [Question] How to scale application based on 'reactive' mode

Gyula Fóra Fri, 01 Sep 2023 06:21:17 -0700

Pretty much, except that with Flink 1.18 the autoscaler can scale the job in
place without restarting the JM (even without reactive mode).


So the best option is actually the autoscaler with Flink 1.18 in native mode
(no reactive mode).

Gyula
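
For readers following the thread, the distinction between the two concepts can be sketched in a few lines of Python. This is a simplified illustration, not Flink or Kubernetes operator code: the function names, parameters, and the scaling formula in the second function are assumptions made for the example. The first function mirrors what reactive mode does (use every available slot, capped by the job's max parallelism); the second mimics the kind of decision an autoscaler makes from load metrics.

```python
import math


def reactive_parallelism(task_managers: int, slots_per_tm: int,
                         max_parallelism: int) -> int:
    """Parallelism reactive mode settles on for the current cluster size.

    Reactive mode decides nothing by itself: it uses every slot the
    cluster currently offers, capped by the job's max parallelism.
    """
    return min(task_managers * slots_per_tm, max_parallelism)


def autoscaler_target_parallelism(current_parallelism: int,
                                  input_rate: float,
                                  processing_capacity: float,
                                  target_utilization: float,
                                  max_parallelism: int) -> int:
    """Hypothetical, simplified autoscaler decision (an assumption for
    illustration, not the operator's actual algorithm).

    input_rate:          records/s arriving at the job
    processing_capacity: records/s the job can process at current_parallelism
    target_utilization:  fraction of capacity the job should run at (0..1]
    """
    per_subtask_capacity = processing_capacity / current_parallelism
    required_capacity = input_rate / target_utilization
    wanted = math.ceil(required_capacity / per_subtask_capacity)
    return max(1, min(wanted, max_parallelism))


# The example from the thread: 1 TM with 1 slot, then a second TM is added.
print(reactive_parallelism(1, 1, 128))  # 1
print(reactive_parallelism(2, 1, 128))  # 2

# An autoscaler instead starts from load: 900 records/s arriving, 1000
# records/s capacity at parallelism 2, aiming for 50% utilization.
print(autoscaler_target_parallelism(2, 900.0, 1000.0, 0.5, 128))  # 4
```

In the reactive case something else (a person, an HPA, a script) still has to add the TaskManager; in the autoscaler case the load metrics drive the decision and the resources follow.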

On Fri, 1 Sep 2023 at 13:54, Dennis Jung <inylov...@gmail.com> wrote:

> Thanks for the feedback.
> Could you check whether I understand correctly?
>
> *Only using 'reactive' mode:*
> Manually adding a TaskManager (TM), for example with './bin/taskmanager.sh
> start', increases parallelism. For example, if job parallelism is 1 with
> 1 TM and I add 1 new TM, the JobManager will be restarted and parallelism
> will be 2.
> But the number of TMs is not controlled automatically.
>
> *Autoscaler + non-reactive:*
> It can flexibly control the number of TMs based on several metrics (CPU
> usage, throughput, ...), and the JobManager will be restarted when scaling.
> But job parallelism stays the same after the number of TMs has changed.
>
> *Autoscaler + 'reactive' mode*:
> It can control the number of TMs based on metrics, and increase/decrease
> job parallelism as the number of TMs changes.
>
> Regards,
> Jung
>
> On Fri, Sep 1, 2023 at 8:16 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
>
>> I would look at reactive scaling as a way to increase / decrease
>> parallelism.
>>
>> It’s not a way to automatically decide when to actually do it, as you need
>> to create new TMs.
>>
>> The autoscaler could use reactive mode to change the parallelism, but you
>> need the autoscaler itself to decide when new resources should be added.
>>
>> On Fri, 1 Sep 2023 at 13:09, Dennis Jung <inylov...@gmail.com> wrote:
>>
>>> So far, what I've found about 'reactive' mode is that it
>>> automatically adjusts job parallelism when the number of TaskManagers
>>> increases or decreases.
>>>
>>>
>>> https://www.slideshare.net/FlinkForward/autoscaling-flink-with-reactive-mode
>>>
>>> Is there some other feature that only 'reactive' mode offers for scaling?
>>>
>>> Thanks.
>>> Regards.
>>>
>>>
>>>
>>> On Fri, Sep 1, 2023 at 4:56 PM Dennis Jung <inylov...@gmail.com> wrote:
>>>
>>>> Hello,
>>>> Thank you for your response. I have a few more questions about the following:
>>>> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/elastic_scaling/
>>>>
>>>> *Reactive Mode configures a job so that it always uses all resources
>>>> available in the cluster. Adding a TaskManager will scale up your job,
>>>> removing resources will scale it down. Flink will manage the parallelism of
>>>> the job, always setting it to the highest possible values.*
>>>> => Does this mean that when I add/remove a TaskManager in 'non-reactive'
>>>> mode, the resources (CPU, memory, etc.) of the cluster do not change?
>>>>
>>>> *Reactive Mode restarts a job on a rescaling event, restoring it from
>>>> the latest completed checkpoint. This means that there is no overhead of
>>>> creating a savepoint (which is needed for manually rescaling a job). Also,
>>>> the amount of data that is reprocessed after rescaling depends on the
>>>> checkpointing interval, and the restore time depends on the state size.*
>>>> => As far as I know, 'rescaling' also works in non-reactive mode, by
>>>> restoring from a checkpoint. What is the difference when using 'reactive' here?
>>>>
>>>> *The Reactive Mode allows Flink users to implement a powerful
>>>> autoscaling mechanism, by having an external service monitor certain
>>>> metrics, such as consumer lag, aggregate CPU utilization, throughput or
>>>> latency. As soon as these metrics are above or below a certain threshold,
>>>> additional TaskManagers can be added or removed from the Flink cluster.*
>>>> => Why is this only possible in 'reactive' mode? This seems more
>>>> related to the 'autoscaler'. Are there specific features/APIs that can
>>>> control TaskManagers/parallelism only in 'reactive' mode?
>>>>
>>>> Thank you.
>>>>
>>>> On Fri, Sep 1, 2023 at 3:30 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>
>>>>> The reactive mode reacts to available resources. The autoscaler reacts
>>>>> to changing load and processing capacity and adjusts resources.
>>>>>
>>>>> Completely different concepts and applicability.
>>>>> Most people want the autoscaler, but this is a recent feature and is
>>>>> specific to the k8s operator at the moment.
>>>>>
>>>>> Gyula
>>>>>
>>>>> On Fri, 1 Sep 2023 at 04:50, Dennis Jung <inylov...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>> Thanks for your notice.
>>>>>>
>>>>>> Then what is the purpose of using 'reactive', if it doesn't do
>>>>>> anything itself?
>>>>>> What is the difference if I use the autoscaler without 'reactive' mode?
>>>>>>
>>>>>> Regards,
>>>>>> Jung
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 18, 2023 at 7:51 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi!
>>>>>>>
>>>>>>> I think what you need is probably not the reactive mode but a proper
>>>>>>> autoscaler. The reactive mode, as you say, doesn't do anything in itself;
>>>>>>> you need to build a lot of logic around it.
>>>>>>>
>>>>>>> Check this instead:
>>>>>>> https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/autoscaler/
>>>>>>>
>>>>>>> The Kubernetes Operator has a built-in autoscaler that can scale
>>>>>>> jobs based on Kafka data rate / processing throughput. It also doesn't
>>>>>>> rely on the reactive mode.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Gyula
>>>>>>>
>>>>>>> On Fri, Aug 18, 2023 at 12:43 PM Dennis Jung <inylov...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>> Sorry for the frequent questions. This is a question about 'reactive'
>>>>>>>> mode.
>>>>>>>>
>>>>>>>> 1. As far as I understand, even though I've set up `scheduler-mode:
>>>>>>>> reactive`, it will not change parallelism automatically by itself based
>>>>>>>> on CPU usage or Kafka consumer rate. It needs an additional resource
>>>>>>>> monitoring feature (such as the Horizontal Pod Autoscaler). Is this
>>>>>>>> correct?
>>>>>>>> 2. Is it possible to create a custom resource monitor provider
>>>>>>>> application? For example, if I want to increase/decrease parallelism
>>>>>>>> based on Kafka consumer rate, do I need to call a specific API from
>>>>>>>> outside to request rescaling?
>>>>>>>> 3. If 2 is correct, what is the difference when using 'reactive'
>>>>>>>> mode? As far as I can tell, calling a specific API will rescale
>>>>>>>> whether or not 'reactive' mode is used (or does the API only work in
>>>>>>>> this mode)?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>>
