Kyungmin Kim created FLINK-31898:
------------------------------------

             Summary: Flink k8s autoscaler does not work as expected
                 Key: FLINK-31898
                 URL: https://issues.apache.org/jira/browse/FLINK-31898
             Project: Flink
          Issue Type: Improvement
            Reporter: Kyungmin Kim
         Attachments: image-2023-04-24-10-54-58-083.png

Hi, I'm using the Flink k8s autoscaler to automatically deploy jobs with the 
proper parallelism.

I was using version 1.4, but I found that it does not scale down properly 
because TRUE_PROCESSING_RATE becomes NaN when the tasks are idle.

In the main branch, I saw that the code was fixed to set TRUE_PROCESSING_RATE 
to positive infinity, which makes scaleFactor very low. I'm now experimentally 
running my job with a Docker image built from the main branch of the 
Flink-k8s-operator repository.
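
To illustrate why the change matters, here is a minimal sketch of how I 
understand the two cases. It is not the operator's actual code: the function 
names, the ratio used for the scale factor, and the minimum parallelism of 1 
are my own assumptions for illustration.

```python
import math

def scale_factor(target_rate, true_processing_rate):
    # Assumed simplification: desired throughput over measured throughput.
    return target_rate / true_processing_rate

def new_parallelism(current, factor, min_parallelism=1):
    # A NaN factor gives no usable decision, so the parallelism stays put.
    if math.isnan(factor):
        return current
    # Otherwise scale the current parallelism and clamp at the minimum.
    return max(min_parallelism, math.ceil(current * factor))

# Idle task with a NaN rate: the factor is NaN, so nothing scales down.
nan_case = new_parallelism(8, scale_factor(5.0, float("nan")))

# Idle task with an infinite rate: the factor is 0.0, so it scales to the minimum.
inf_case = new_parallelism(8, scale_factor(5.0, float("inf")))
```

Under these assumptions the NaN case stays at 8 while the infinity case drops 
to 1, which matches the behavior I saw before and after switching to main.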

It now scales down properly, but it does not converge to the optimal 
parallelism: after scaling down, it jumps back up to a high parallelism.

Below are the experimental setup and a figure showing how the parallelism 
changed over time.
 * input rate: about 40 RPS
 * each task can process 10 TPS (intentionally throttled)
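
As a rough sanity check on what "optimal" means for this setup, the sketch 
below computes the parallelism I would expect. The function name and the 
target_utilization parameter are hypothetical and only model some headroom; 
they are not taken from the operator.

```python
import math

def required_parallelism(input_rate, per_task_rate, target_utilization=1.0):
    # Tasks needed so that each one runs at or below the target utilization.
    return math.ceil(input_rate / (per_task_rate * target_utilization))

# 40 RPS of input, 10 TPS per task, no headroom.
optimal = required_parallelism(40, 10)

# Same load with each task assumed to run at 70% of its capacity.
with_headroom = required_parallelism(40, 10, target_utilization=0.7)
```

With no headroom this gives 4, and with the assumed 70% utilization it gives 
6, so I would expect the job to settle somewhere in that range.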

!image-2023-04-24-10-54-58-083.png!

Using the default configuration leads to the same result. Is there anything 
else I can try? Thank you.

 


