Hello,

Thanks for your reply.

1. With "Flink 1.18 + non-reactive", is the parallelism changed according to the number of TMs?
2. The documentation (https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.6/docs/custom-resource/autoscaler/) says "we are not using any container memory / CPU utilization metrics directly here". Which metrics does the autoscaler use internally?
3. I'm deploying on standalone Kubernetes (https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/). Are the autoscaler features only available through the Flink Kubernetes Operator? (Sorry, I don't understand this clearly yet.)

Regards

On Fri, 1 Sep 2023 at 22:20, Gyula Fóra <gyula.f...@gmail.com> wrote:

> Pretty much, except that with Flink 1.18 the autoscaler can scale the job in
> place without restarting the JM (even without reactive mode).
>
> So the best option is actually the autoscaler with Flink 1.18 native mode
> (no reactive).
>
> Gyula
>
> On Fri, 1 Sep 2023 at 13:54, Dennis Jung <inylov...@gmail.com> wrote:
>
>> Thanks for the feedback.
>> Could you check whether I understand correctly?
>>
>> *Only using 'reactive' mode:*
>> Manually adding a TaskManager (TM), for example with './bin/taskmanager.sh
>> start', increases the parallelism. For example, when the job parallelism
>> is 1 with 1 TM and 1 new TM is added, the JobManager will be restarted and
>> the parallelism will be 2.
>> But the number of TMs is not controlled automatically.
>>
>> *Autoscaler + non-reactive:*
>> It can flexibly control the number of TMs based on several metrics (CPU
>> usage, throughput, ...), and the JobManager will be restarted when scaling.
>> But the job parallelism stays the same after the number of TMs has changed.
>>
>> *Autoscaler + 'reactive' mode*:
>> It can control the number of TMs based on metrics, and increase/decrease
>> the job parallelism by changing the number of TMs.
>>
>> Regards,
>> Jung
>>
>> On Fri, 1 Sep 2023 at 20:16, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>
>>> I would look at reactive scaling as a way to increase / decrease
>>> parallelism.
>>>
>>> It’s not a way to automatically decide when to actually do it, as you
>>> need to create new TMs.
>>>
>>> The autoscaler could use reactive mode to change the parallelism, but you
>>> need the autoscaler itself to decide when new resources should be added.
>>>
>>> On Fri, 1 Sep 2023 at 13:09, Dennis Jung <inylov...@gmail.com> wrote:
>>>
>>>> For now, what I've found about 'reactive' mode is that it
>>>> automatically adjusts the job parallelism when the number of
>>>> TaskManagers is increased/decreased.
>>>>
>>>> https://www.slideshare.net/FlinkForward/autoscaling-flink-with-reactive-mode
>>>>
>>>> Is there any other feature that only 'reactive' mode offers for
>>>> scaling?
>>>>
>>>> Thanks.
>>>> Regards.
>>>>
>>>> On Fri, 1 Sep 2023 at 16:56, Dennis Jung <inylov...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>> Thank you for your response. I have a few more questions about the
>>>>> following:
>>>>> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/elastic_scaling/
>>>>>
>>>>> *Reactive Mode configures a job so that it always uses all resources
>>>>> available in the cluster. Adding a TaskManager will scale up your job,
>>>>> removing resources will scale it down. Flink will manage the parallelism
>>>>> of the job, always setting it to the highest possible values.*
>>>>> => Does this mean that when I add/remove a TaskManager in 'non-reactive'
>>>>> mode, the resources (CPU, memory, etc.) of the cluster are not changed?
>>>>>
>>>>> *Reactive Mode restarts a job on a rescaling event, restoring it from
>>>>> the latest completed checkpoint. This means that there is no overhead of
>>>>> creating a savepoint (which is needed for manually rescaling a job). Also,
>>>>> the amount of data that is reprocessed after rescaling depends on the
>>>>> checkpointing interval, and the restore time depends on the state size.*
>>>>> => As far as I know, 'rescaling' also works in non-reactive mode, by
>>>>> restoring from a checkpoint. What difference does 'reactive' make here?
>>>>>
>>>>> *The Reactive Mode allows Flink users to implement a powerful
>>>>> autoscaling mechanism, by having an external service monitor certain
>>>>> metrics, such as consumer lag, aggregate CPU utilization, throughput or
>>>>> latency. As soon as these metrics are above or below a certain threshold,
>>>>> additional TaskManagers can be added or removed from the Flink cluster.*
>>>>> => Why is this only possible in 'reactive' mode? It seems more
>>>>> related to the 'autoscaler'.
>>>>> Are there any specific features/APIs that can control
>>>>> TaskManagers/parallelism only in 'reactive' mode?
>>>>>
>>>>> Thank you.
>>>>>
>>>>> On Fri, 1 Sep 2023 at 15:30, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>
>>>>>> The reactive mode reacts to available resources. The autoscaler
>>>>>> reacts to changing load and processing capacity and adjusts resources.
>>>>>>
>>>>>> Completely different concepts and applicability.
>>>>>> Most people want the autoscaler, but this is a recent feature and is
>>>>>> specific to the k8s operator at the moment.
>>>>>>
>>>>>> Gyula
>>>>>>
>>>>>> On Fri, 1 Sep 2023 at 04:50, Dennis Jung <inylov...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>> Thanks for your reply.
>>>>>>>
>>>>>>> Then what is the purpose of using 'reactive', if it doesn't do
>>>>>>> anything by itself?
>>>>>>> What is the difference if I use the autoscaler without 'reactive' mode?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Jung
>>>>>>>
>>>>>>> On Fri, 18 Aug 2023 at 19:51, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> I think what you need is probably not the reactive mode but a
>>>>>>>> proper autoscaler. The reactive mode, as you say, doesn't do anything
>>>>>>>> by itself; you need to build a lot of logic around it.
>>>>>>>>
>>>>>>>> Check this instead:
>>>>>>>> https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/autoscaler/
>>>>>>>>
>>>>>>>> The Kubernetes Operator has a built-in autoscaler that can scale
>>>>>>>> jobs based on Kafka data rate / processing throughput. It also doesn't
>>>>>>>> rely on the reactive mode.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Gyula
>>>>>>>>
>>>>>>>> On Fri, Aug 18, 2023 at 12:43 PM Dennis Jung <inylov...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>> Sorry for the frequent questions. This is a question about 'reactive'
>>>>>>>>> mode.
>>>>>>>>>
>>>>>>>>> 1. As far as I understand, even though I've set `scheduler-mode:
>>>>>>>>> reactive`, it will not change the parallelism automatically by itself
>>>>>>>>> based on CPU usage or Kafka consumer rate. It needs an additional
>>>>>>>>> resource monitoring feature (such as the Horizontal Pod Autoscaler
>>>>>>>>> or something similar). Is this correct?
>>>>>>>>> 2. Is it possible to create a custom resource monitor provider
>>>>>>>>> application? For example, if I want to increase/decrease the
>>>>>>>>> parallelism based on the Kafka consumer rate, do I need to call a
>>>>>>>>> specific API from outside to trigger rescaling?
>>>>>>>>> 3. If 2 is correct, what difference does 'reactive' mode make?
>>>>>>>>> As far as I can tell, calling a specific API will rescale the job
>>>>>>>>> whether 'reactive' mode is used or not (or does the API only work
>>>>>>>>> in this mode)?
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> Regards
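
The split of responsibilities discussed in this thread (reactive mode only adapts parallelism to the resources that exist, while something else has to decide when to add or remove TaskManagers) can be illustrated with a minimal Python sketch. This is not Flink or operator code. The function names, the lag thresholds, and the one-TM-at-a-time policy are all hypothetical, and the parallelism formula is a simplified model that assumes every TaskManager offers the same number of slots.

```python
def reactive_parallelism(task_managers: int, slots_per_tm: int,
                         max_parallelism: int) -> int:
    """Simplified model of reactive mode: use every available slot,
    capped by the job's max parallelism. Illustrative only."""
    return min(task_managers * slots_per_tm, max_parallelism)


def desired_task_managers(current: int, consumer_lag: int,
                          scale_up_lag: int, scale_down_lag: int,
                          min_tms: int = 1, max_tms: int = 10) -> int:
    """Hypothetical external policy: reactive mode does not make this
    decision, so a monitor has to. Adds one TM when the lag is above
    the upper threshold, removes one when it is below the lower one."""
    if consumer_lag > scale_up_lag:
        target = current + 1
    elif consumer_lag < scale_down_lag:
        target = current - 1
    else:
        target = current
    # Keep the result inside the allowed TM range.
    return max(min_tms, min(max_tms, target))


# Example from the thread: parallelism 1 with 1 TM, then one TM is added.
tms = desired_task_managers(current=1, consumer_lag=50_000,
                            scale_up_lag=10_000, scale_down_lag=1_000)
print(tms, reactive_parallelism(tms, slots_per_tm=1, max_parallelism=128))
# prints "2 2"
```

In this model the second function plays the role of the external monitor (or an autoscaler), and the first only describes how a reactive-mode job would follow whatever TM count results.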