Hi,

Reactive Mode and the Autoscaler in the Kubernetes operator are two different approaches to elastic scaling in Flink.
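(For orientation, a minimal sketch of how each is switched on. The keys below are recalled from the Flink and operator docs and may differ between versions, and the values are purely illustrative, so treat this as a sketch rather than a reference: Reactive Mode is a scheduler setting in the Flink configuration of a standalone cluster, while the Autoscaler is configured on the operator's FlinkDeployment resource.)

# Reactive Mode: Flink configuration of a standalone cluster
scheduler-mode: reactive

# Autoscaler: under spec.flinkConfiguration of the FlinkDeployment (operator 1.8 key names, illustrative values)
job.autoscaler.enabled: "true"
job.autoscaler.target.utilization: "0.7"            # aim to keep operators ~70% busy
job.autoscaler.target.utilization.boundary: "0.1"   # i.e. roughly a 60%-80% band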
Reactive Mode [1] has to be used together with Flink's passive resource management approach (only Standalone mode takes this approach), where the number of TMs is controlled by an external system and the job parallelism is derived from the available resources in a reactive manner. The workflow is as follows: an external service (such as the K8s HPA controller) monitors the job load and adjusts the number of TMs to match it -> Flink reactively adjusts the job parallelism.

The Autoscaler [2] is to be used with Flink's active resource management approach (YARN / K8s Application or Session mode), where the number of TMs is derived from the job's resource needs, and Flink actively requests/releases TMs to match them. The workflow for the Autoscaler is: the Autoscaler monitors the job load and actively adjusts the job parallelism to match it -> Flink adjusts the number of TMs to match the job's needs.

I would recommend the Autoscaler over Reactive Mode because:
1. The Autoscaler is an all-in-one solution, while Reactive Mode needs to be paired with an external system that adjusts the number of TMs.
2. The Autoscaler has a fine-tuned scaling algorithm tailored for streaming workloads. It tunes each operator's parallelism individually and takes task load balancing into account, whereas Reactive Mode adjusts all operators' parallelism uniformly.

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#reactive-mode
[2] https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/custom-resource/autoscaler/

Best,
Zhanghao Chen

________________________________
From: Sachin Sharma <sachinapr...@gmail.com>
Sent: Saturday, June 8, 2024 3:02
To: Gyula Fóra <gyula.f...@gmail.com>
Cc: Chetas Joshi <chetas.jo...@gmail.com>; Oscar Perez via user <user@flink.apache.org>
Subject: Re: Understanding flink-autoscaler behavior

Hi,

I have a question related to this. I am doing a POC with Kubernetes operator 1.8 and Flink 1.18 with Reactive Mode enabled. I added some dummy slow and fast operators to the Flink job, and I can see that back pressure accumulates, but I am not sure why my Flink task managers are not scaled by the operator. Also, can someone explain whether the autoscaler's job is just to add more task managers, with Reactive Mode then adjusting the parallelism based on the configuration?

As per the operator documentation: "Users configure the target utilization percentage of the operators in the pipeline, e.g. keep all operators between 60% - 80% busy. The autoscaler then finds a parallelism configuration such that the output rates of all operators match the input rates of all their downstream operators at the targeted utilization."

So does the autoscaler (part of the Kubernetes operator) control the parallelism, or does Reactive Mode in the Flink job control it?

Thanks & Regards,
Sachin Sharma

On Fri, Jun 7, 2024 at 4:55 AM Gyula Fóra <gyula.f...@gmail.com> wrote:

Hi!

To simplify things you can generally look at TRUE_PROCESSING_RATE, SCALE_UP_RATE_THRESHOLD and SCALE_DOWN_RATE_THRESHOLD. If TPR is below the scale-up threshold then we should scale up, and if it's above the scale-down threshold then we scale down.
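To make that rule concrete, here is a rough sketch of the decision in Java. It is purely illustrative: the method and parameter names are made up for this example, and the linear-capacity assumption is a simplification, not the operator's actual implementation.

public class ScalingDecisionSketch {

    // Illustrative only: roughly the comparison described above.
    // Capacity is assumed to grow linearly with parallelism.
    static int suggestParallelism(
            double trueProcessingRate,  // records/sec the vertex can process at the current parallelism
            double scaleUpThreshold,    // below this TPR we should scale up
            double scaleDownThreshold,  // above this TPR we may scale down
            double targetRate,          // rate the vertex should be able to sustain (incl. catch-up)
            int currentParallelism,
            int maxParallelism) {       // for a source: number of Kafka partitions

        if (trueProcessingRate >= scaleUpThreshold && trueProcessingRate <= scaleDownThreshold) {
            return currentParallelism;  // already within the target utilization band
        }
        double perSubtaskRate = trueProcessingRate / currentParallelism;
        int suggested = (int) Math.ceil(targetRate / perSubtaskRate);
        // A vertex can never exceed its max parallelism (for sources: the partition count).
        return Math.max(1, Math.min(suggested, maxParallelism));
    }

    public static void main(String[] args) {
        // Numbers taken from the logs below: TPR ~17498, scale-up threshold 83995,
        // target processing capacity 168270, source max parallelism 1.
        int p = suggestParallelism(17498, 83995, Double.POSITIVE_INFINITY, 168270, 1, 1);
        System.out.println(p); // -> 1: the rates alone would suggest ~10 subtasks, but one Kafka partition caps it
    }
}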
In your case, what we see for your source (cbc357ccb763df2852fee8c4fc7d55f2) in the logs is:

TPR: 17498
SCALE_UP_THRESHOLD: 83995

So it should definitely be scaled up in theory. However, you can also see:

`Updating source cbc357ccb763df2852fee8c4fc7d55f2 max parallelism based on available partitions to 1`

This means that the source max parallelism was determined by the available Kafka partitions to be 1. It would not make sense to increase the parallelism even though we are clearly falling behind, as the source cannot consume a single partition in parallel.

Hope this helps,
Gyula

On Fri, Jun 7, 2024 at 3:41 AM Chetas Joshi <chetas.jo...@gmail.com> wrote:

Hi Community,

I want to understand the following logs from the flink-k8s-operator autoscaler. My Flink pipeline, running on 1.18.0 and using flink-k8s-operator 1.8.0, is not scaling up even though the source vertex is back-pressured.

2024-06-06 21:33:35,270 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Updating source cbc357ccb763df2852fee8c4fc7d55f2 max parallelism based on available partitions to 1
2024-06-06 21:33:35,276 o.a.f.a.RestApiMetricsCollector [DEBUG][flink/pipeline-pipelinelocal] Querying metrics {busyTimeMsPerSecond=BUSY_TIME_PER_SEC, Source__dd-log-source.numRecordsOut=SOURCE_TASK_NUM_RECORDS_OUT, Source__dd-log-source.numRecordsIn=SOURCE_TASK_NUM_RECORDS_IN, Source__dd-log-source.pendingRecords=PENDING_RECORDS, Source__dd-log-source.numRecordsInPerSecond=SOURCE_TASK_NUM_RECORDS_IN_PER_SEC, backPressuredTimeMsPerSecond=BACKPRESSURE_TIME_PER_SEC} for cbc357ccb763df2852fee8c4fc7d55f2
2024-06-06 21:33:35,282 o.a.f.a.RestApiMetricsCollector [DEBUG][flink/pipeline-pipelinelocal] Querying metrics {busyTimeMsPerSecond=BUSY_TIME_PER_SEC} for 61214243927da46230dfd349fba7b8e6
2024-06-06 21:33:35,286 o.a.f.a.RestApiMetricsCollector [DEBUG][flink/pipeline-pipelinelocal] Querying metrics {busyTimeMsPerSecond=BUSY_TIME_PER_SEC} for 7758b2b5ada48872db09a5c48176e34e
2024-06-06 21:33:35,291 o.a.f.a.RestApiMetricsCollector [DEBUG][flink/pipeline-pipelinelocal] Querying metrics {busyTimeMsPerSecond=BUSY_TIME_PER_SEC} for eab9c0013081b8479e60463931f3a593
2024-06-06 21:33:35,304 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Calculating vertex scaling metrics for cbc357ccb763df2852fee8c4fc7d55f2 from {BACKPRESSURE_TIME_PER_SEC=AggregatedMetric{id='backPressuredTimeMsPerSecond', mim='0.0', max='0.0', avg='0.0', sum='0.0'}, BUSY_TIME_PER_SEC=AggregatedMetric{id='busyTimeMsPerSecond', mim='192.0', max='192.0', avg='192.0', sum='192.0'}, SOURCE_TASK_NUM_RECORDS_OUT=AggregatedMetric{id='Source__dd-log-source.numRecordsOut', mim='613279.0', max='613279.0', avg='613279.0', sum='613279.0'}, PENDING_RECORDS=AggregatedMetric{id='Source__dd-log-source.pendingRecords', mim='0.0', max='0.0', avg='0.0', sum='0.0'}, SOURCE_TASK_NUM_RECORDS_IN=AggregatedMetric{id='Source__dd-log-source.numRecordsIn', mim='613279.0', max='613279.0', avg='613279.0', sum='613279.0'}, SOURCE_TASK_NUM_RECORDS_IN_PER_SEC=AggregatedMetric{id='Source__dd-log-source.numRecordsInPerSecond', mim='1682.7333333333333', max='1682.7333333333333', avg='1682.7333333333333', sum='1682.7333333333333'}}
2024-06-06 21:33:35,304 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Vertex scaling metrics for cbc357ccb763df2852fee8c4fc7d55f2: {ACCUMULATED_BUSY_TIME=32301.0, NUM_RECORDS_OUT=125.0, LOAD=0.192, NUM_RECORDS_IN=613279.0, OBSERVED_TPR=Infinity, LAG=0.0}
2024-06-06 21:33:35,304 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Calculating vertex scaling metrics for eab9c0013081b8479e60463931f3a593 from {BUSY_TIME_PER_SEC=AggregatedMetric{id='busyTimeMsPerSecond', mim='0.0', max='0.0', avg='0.0', sum='0.0'}}
2024-06-06 21:33:35,304 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Vertex scaling metrics for eab9c0013081b8479e60463931f3a593: {ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=0.0, LOAD=0.0, NUM_RECORDS_IN=8.0}
2024-06-06 21:33:35,304 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Calculating vertex scaling metrics for 61214243927da46230dfd349fba7b8e6 from {BUSY_TIME_PER_SEC=AggregatedMetric{id='busyTimeMsPerSecond', mim='0.0', max='0.0', avg='0.0', sum='0.0'}}
2024-06-06 21:33:35,304 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Vertex scaling metrics for 61214243927da46230dfd349fba7b8e6: {ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=0.0, LOAD=0.0, NUM_RECORDS_IN=8.0}
2024-06-06 21:33:35,304 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Calculating vertex scaling metrics for 7758b2b5ada48872db09a5c48176e34e from {BUSY_TIME_PER_SEC=AggregatedMetric{id='busyTimeMsPerSecond', mim='0.0', max='0.0', avg='0.0', sum='0.0'}}
2024-06-06 21:33:35,305 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Vertex scaling metrics for 7758b2b5ada48872db09a5c48176e34e: {ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=8.0, LOAD=0.0, NUM_RECORDS_IN=117.0}
2024-06-06 21:33:35,305 o.a.f.a.ScalingMetricCollector [DEBUG][flink/pipeline-pipelinelocal] Global metrics: {NUM_TASK_SLOTS_USED=1.0, HEAP_MAX_USAGE_RATIO=0.6800108099126959, HEAP_MEMORY_USED=4.74886648E8, METASPACE_MEMORY_USED=1.40677456E8, MANAGED_MEMORY_USED=0.0}
2024-06-06 21:33:35,306 o.a.f.a.ScalingTracking [DEBUG][flink/pipeline-pipelinelocal] Cannot record restart duration because already set in the latest record: PT0.114185S
2024-06-06 21:33:35,307 o.a.f.a.JobAutoScalerImpl [DEBUG][flink/pipeline-pipelinelocal] Collected metrics: CollectedMetricHistory(jobTopology=JobTopology(vertexInfos={cbc357ccb763df2852fee8c4fc7d55f2=VertexInfo(id=cbc357ccb763df2852fee8c4fc7d55f2, inputs={}, outputs={61214243927da46230dfd349fba7b8e6=REBALANCE, 7758b2b5ada48872db09a5c48176e34e=HASH}, parallelism=1, maxParallelism=1, originalMaxParallelism=20, finished=false, ioMetrics=IOMetrics(numRecordsIn=0, numRecordsOut=125, accumulatedBusyTime=32301.0)), eab9c0013081b8479e60463931f3a593=VertexInfo(id=eab9c0013081b8479e60463931f3a593, inputs={7758b2b5ada48872db09a5c48176e34e=REBALANCE}, outputs={}, parallelism=1, maxParallelism=1, originalMaxParallelism=1, finished=false, ioMetrics=IOMetrics(numRecordsIn=8, numRecordsOut=0, accumulatedBusyTime=0.0)), 61214243927da46230dfd349fba7b8e6=VertexInfo(id=61214243927da46230dfd349fba7b8e6, inputs={cbc357ccb763df2852fee8c4fc7d55f2=REBALANCE}, outputs={}, parallelism=1, maxParallelism=1, originalMaxParallelism=1, finished=false, ioMetrics=IOMetrics(numRecordsIn=8, numRecordsOut=0, accumulatedBusyTime=0.0)), 7758b2b5ada48872db09a5c48176e34e=VertexInfo(id=7758b2b5ada48872db09a5c48176e34e, inputs={cbc357ccb763df2852fee8c4fc7d55f2=HASH}, outputs={eab9c0013081b8479e60463931f3a593=REBALANCE}, parallelism=1, maxParallelism=20, originalMaxParallelism=20, finished=false, ioMetrics=IOMetrics(numRecordsIn=117, numRecordsOut=8, accumulatedBusyTime=0.0))}, finishedVertices=[], verticesInTopologicalOrder=[cbc357ccb763df2852fee8c4fc7d55f2, 61214243927da46230dfd349fba7b8e6, 7758b2b5ada48872db09a5c48176e34e, eab9c0013081b8479e60463931f3a593]), metricHistory={2024-06-06T21:32:35.170678Z=CollectedMetrics(vertexMetrics={cbc357ccb763df2852fee8c4fc7d55f2={ACCUMULATED_BUSY_TIME=20821.0, NUM_RECORDS_OUT=109.0, LOAD=0.0, NUM_RECORDS_IN=512339.0, OBSERVED_TPR=Infinity, LAG=0.0}, eab9c0013081b8479e60463931f3a593={ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=0.0, LOAD=0.0, NUM_RECORDS_IN=7.0}, 61214243927da46230dfd349fba7b8e6={ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=0.0, LOAD=0.0, NUM_RECORDS_IN=7.0}, 7758b2b5ada48872db09a5c48176e34e={ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=7.0, LOAD=0.0, NUM_RECORDS_IN=102.0}}, globalMetrics={NUM_TASK_SLOTS_USED=1.0, HEAP_MAX_USAGE_RATIO=0.5849425971687019, HEAP_MEMORY_USED=4.08495608E8, METASPACE_MEMORY_USED=1.43093792E8, MANAGED_MEMORY_USED=0.0}), 2024-06-06T21:33:35.258489Z=CollectedMetrics(vertexMetrics={cbc357ccb763df2852fee8c4fc7d55f2={ACCUMULATED_BUSY_TIME=32301.0, NUM_RECORDS_OUT=125.0, LOAD=0.192, NUM_RECORDS_IN=613279.0, OBSERVED_TPR=Infinity, LAG=0.0}, eab9c0013081b8479e60463931f3a593={ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=0.0, LOAD=0.0, NUM_RECORDS_IN=8.0}, 61214243927da46230dfd349fba7b8e6={ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=0.0, LOAD=0.0, NUM_RECORDS_IN=8.0}, 7758b2b5ada48872db09a5c48176e34e={ACCUMULATED_BUSY_TIME=0.0, NUM_RECORDS_OUT=8.0, LOAD=0.0, NUM_RECORDS_IN=117.0}}, globalMetrics={NUM_TASK_SLOTS_USED=1.0, HEAP_MAX_USAGE_RATIO=0.6800108099126959, HEAP_MEMORY_USED=4.74886648E8, METASPACE_MEMORY_USED=1.40677456E8, MANAGED_MEMORY_USED=0.0})}, jobRunningTs=2024-06-06T21:17:35.712Z, fullyCollected=true)
2024-06-06 21:33:35,307 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Restart time used in metrics evaluation: PT5M
2024-06-06 21:33:35,307 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Using busy time based tpr 17498.932104004747 for cbc357ccb763df2852fee8c4fc7d55f2.
2024-06-06 21:33:35,307 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Computing edge (cbc357ccb763df2852fee8c4fc7d55f2, 61214243927da46230dfd349fba7b8e6) data rate for single input downstream task
2024-06-06 21:33:35,307 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Computed output ratio for edge (cbc357ccb763df2852fee8c4fc7d55f2 -> 61214243927da46230dfd349fba7b8e6) : 9.906875371507826E-6
2024-06-06 21:33:35,308 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Computing edge (cbc357ccb763df2852fee8c4fc7d55f2, 7758b2b5ada48872db09a5c48176e34e) data rate for single input downstream task
2024-06-06 21:33:35,308 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Computed output ratio for edge (cbc357ccb763df2852fee8c4fc7d55f2 -> 7758b2b5ada48872db09a5c48176e34e) : 1.486031305726174E-4
2024-06-06 21:33:35,308 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Computing edge (7758b2b5ada48872db09a5c48176e34e, eab9c0013081b8479e60463931f3a593) data rate for single input downstream task
2024-06-06 21:33:35,308 o.a.f.a.ScalingMetricEvaluator [DEBUG][flink/pipeline-pipelinelocal] Computed output ratio for edge (7758b2b5ada48872db09a5c48176e34e -> eab9c0013081b8479e60463931f3a593) : 0.06666666666666667
2024-06-06 21:33:35,308 o.a.f.a.JobAutoScalerImpl [DEBUG][flink/pipeline-pipelinelocal] Evaluated metrics: EvaluatedMetrics(vertexMetrics={cbc357ccb763df2852fee8c4fc7d55f2={TARGET_DATA_RATE=EvaluatedScalingMetric(current=NaN, average=1679.897), PARALLELISM=EvaluatedScalingMetric(current=1.0, average=NaN), SCALE_UP_RATE_THRESHOLD=EvaluatedScalingMetric(current=83995.0, average=NaN), MAX_PARALLELISM=EvaluatedScalingMetric(current=1.0, average=NaN), TRUE_PROCESSING_RATE=EvaluatedScalingMetric(current=NaN, average=17498.932), LOAD=EvaluatedScalingMetric(current=NaN, average=0.096), SCALE_DOWN_RATE_THRESHOLD=EvaluatedScalingMetric(current=Infinity, average=NaN), CATCH_UP_DATA_RATE=EvaluatedScalingMetric(current=0.0, average=NaN), LAG=EvaluatedScalingMetric(current=0.0, average=NaN)}, eab9c0013081b8479e60463931f3a593={TARGET_DATA_RATE=EvaluatedScalingMetric(current=NaN, average=0.017), PARALLELISM=EvaluatedScalingMetric(current=1.0, average=NaN), SCALE_UP_RATE_THRESHOLD=EvaluatedScalingMetric(current=1.0, average=NaN), MAX_PARALLELISM=EvaluatedScalingMetric(current=1.0, average=NaN), TRUE_PROCESSING_RATE=EvaluatedScalingMetric(current=NaN, average=Infinity), LOAD=EvaluatedScalingMetric(current=NaN, average=0.0), SCALE_DOWN_RATE_THRESHOLD=EvaluatedScalingMetric(current=Infinity, average=NaN), CATCH_UP_DATA_RATE=EvaluatedScalingMetric(current=0.0, average=NaN)}, 61214243927da46230dfd349fba7b8e6={TARGET_DATA_RATE=EvaluatedScalingMetric(current=NaN, average=0.017), PARALLELISM=EvaluatedScalingMetric(current=1.0, average=NaN), SCALE_UP_RATE_THRESHOLD=EvaluatedScalingMetric(current=1.0, average=NaN), MAX_PARALLELISM=EvaluatedScalingMetric(current=1.0, average=NaN), TRUE_PROCESSING_RATE=EvaluatedScalingMetric(current=NaN, average=Infinity), LOAD=EvaluatedScalingMetric(current=NaN, average=0.0), SCALE_DOWN_RATE_THRESHOLD=EvaluatedScalingMetric(current=Infinity, average=NaN), CATCH_UP_DATA_RATE=EvaluatedScalingMetric(current=0.0, average=NaN)}, 7758b2b5ada48872db09a5c48176e34e={TARGET_DATA_RATE=EvaluatedScalingMetric(current=NaN, average=0.25), PARALLELISM=EvaluatedScalingMetric(current=1.0, average=NaN), SCALE_UP_RATE_THRESHOLD=EvaluatedScalingMetric(current=13.0, average=NaN), MAX_PARALLELISM=EvaluatedScalingMetric(current=20.0, average=NaN), TRUE_PROCESSING_RATE=EvaluatedScalingMetric(current=NaN, average=Infinity), LOAD=EvaluatedScalingMetric(current=NaN, average=0.0), SCALE_DOWN_RATE_THRESHOLD=EvaluatedScalingMetric(current=Infinity, average=NaN), CATCH_UP_DATA_RATE=EvaluatedScalingMetric(current=0.0, average=NaN)}}, globalMetrics={HEAP_MAX_USAGE_RATIO=EvaluatedScalingMetric(current=0.68, average=0.632), NUM_TASK_SLOTS_USED=EvaluatedScalingMetric(current=1.0, average=NaN), GC_PRESSURE=EvaluatedScalingMetric(current=NaN, average=NaN), HEAP_MEMORY_USED=EvaluatedScalingMetric(current=4.74886648E8, average=4.41691128E8), METASPACE_MEMORY_USED=EvaluatedScalingMetric(current=1.40677456E8, average=1.41885624E8), MANAGED_MEMORY_USED=EvaluatedScalingMetric(current=0.0, average=0.0)})
2024-06-06 21:33:35,308 o.a.f.a.ScalingExecutor [DEBUG][flink/pipeline-pipelinelocal] Restart time used in scaling summary computation: PT5M
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Target processing capacity for cbc357ccb763df2852fee8c4fc7d55f2 is 168270.0
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Capped target processing capacity for cbc357ccb763df2852fee8c4fc7d55f2 is 168270.0
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Specified autoscaler maximum parallelism 200 is greater than the operator max parallelism 1. This means the operator max parallelism can never be reached.
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Target processing capacity for eab9c0013081b8479e60463931f3a593 is 2.0
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Computed scale factor of 0.0 for eab9c0013081b8479e60463931f3a593 is capped by maximum scale down factor to 0.4
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Capped target processing capacity for eab9c0013081b8479e60463931f3a593 is Infinity
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Specified autoscaler maximum parallelism 200 is greater than the operator max parallelism 1. This means the operator max parallelism can never be reached.
2024-06-06 21:33:35,308 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Target processing capacity for 61214243927da46230dfd349fba7b8e6 is 2.0
2024-06-06 21:33:35,309 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Computed scale factor of 0.0 for 61214243927da46230dfd349fba7b8e6 is capped by maximum scale down factor to 0.4
2024-06-06 21:33:35,309 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Capped target processing capacity for 61214243927da46230dfd349fba7b8e6 is Infinity
2024-06-06 21:33:35,309 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Specified autoscaler maximum parallelism 200 is greater than the operator max parallelism 1. This means the operator max parallelism can never be reached.
2024-06-06 21:33:35,309 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Target processing capacity for 7758b2b5ada48872db09a5c48176e34e is 25.0
2024-06-06 21:33:35,309 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Computed scale factor of 0.0 for 7758b2b5ada48872db09a5c48176e34e is capped by maximum scale down factor to 0.4
2024-06-06 21:33:35,309 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Capped target processing capacity for 7758b2b5ada48872db09a5c48176e34e is Infinity
2024-06-06 21:33:35,309 o.a.f.a.JobVertexScaler [DEBUG][flink/pipeline-pipelinelocal] Specified autoscaler maximum parallelism 200 is greater than the operator max parallelism 20. This means the operator max parallelism can never be reached.
2024-06-06 21:33:35,309 o.a.f.a.ScalingExecutor [INFO ][flink/pipeline-pipelinelocal] All job vertices are currently running at their target parallelism.

Some questions:
1. Does the autoscaler decide to scale a job vertex when the target processing capacity is higher than the current processing capacity? If yes, how can I check the current processing capacity?
2. Which metrics in the above logs indicate that the target utilization threshold has not been reached, and hence that all the vertices are running at their target parallelism?

Thank you
Chetas