yuanfenghu created FLINK-35823:
----------------------------------

             Summary: Introduce parameters to control the upper limit of rescale to avoid unlimited shrinkage due to server-side bottlenecks or data skew.
                 Key: FLINK-35823
                 URL: https://issues.apache.org/jira/browse/FLINK-35823
             Project: Flink
          Issue Type: Improvement
          Components: Autoscaler
            Reporter: yuanfenghu
             Fix For: 2.0.0

1. If a Flink application writes data to an external storage system such as HDFS or Kafka, that external system can become the bottleneck of the entire job. For example, when HDFS throughput decreases, write I/O time increases and the corresponding Flink busy metric rises as well. The autoscaler then determines that the parallelism needs to be increased to raise the write rate. Because the bottleneck is on the external server, however, increasing the parallelism does not help. Each subsequent evaluation cycle therefore increases the parallelism again, until parallelism = max-parallelism.

2. If some tasks have data skew, the same problem occurs.

Therefore, we should introduce a new parameter to guard against this: if the parallelism keeps increasing while the throughput stays essentially the same, there is no need to scale up any further.
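
To make the proposed check concrete, here is a minimal sketch of one way it could be expressed. It is not the autoscaler's actual implementation: the class `RescaleLimitSketch`, the `Observation` type, the method `scaleUpStillUseful`, and the `minGainRatio` parameter are all hypothetical names chosen for illustration. The idea is to compare the throughput gain actually observed after the last scale-up with the gain that a linear speed-up would have produced, and to stop scaling up when the observed gain falls below a configurable fraction of the expected one.

```java
import java.util.Arrays;
import java.util.List;

public class RescaleLimitSketch {

    /** Hypothetical record of one evaluation: parallelism used and throughput measured. */
    static final class Observation {
        final int parallelism;
        final double recordsPerSecond;

        Observation(int parallelism, double recordsPerSecond) {
            this.parallelism = parallelism;
            this.recordsPerSecond = recordsPerSecond;
        }
    }

    /**
     * Returns false when the most recent scale-up produced less than
     * minGainRatio of the throughput gain a linear speed-up would predict.
     * minGainRatio is an assumed tuning parameter, e.g. 0.1 for 10%.
     */
    static boolean scaleUpStillUseful(List<Observation> history, double minGainRatio) {
        if (history.size() < 2) {
            return true; // not enough data to judge
        }
        Observation prev = history.get(history.size() - 2);
        Observation last = history.get(history.size() - 1);
        if (last.parallelism <= prev.parallelism || prev.recordsPerSecond <= 0) {
            return true; // last change was not a scale-up, or no usable baseline
        }
        // Relative gain expected if throughput scaled linearly with parallelism.
        double expectedGain = (double) last.parallelism / prev.parallelism - 1.0;
        // Relative gain actually observed.
        double actualGain = last.recordsPerSecond / prev.recordsPerSecond - 1.0;
        return actualGain >= minGainRatio * expectedGain;
    }

    public static void main(String[] args) {
        // Parallelism doubled but throughput barely moved: external bottleneck or skew.
        List<Observation> stalled = Arrays.asList(
                new Observation(4, 1000.0), new Observation(8, 1020.0));
        // Parallelism doubled and throughput nearly doubled: scaling is effective.
        List<Observation> healthy = Arrays.asList(
                new Observation(4, 1000.0), new Observation(8, 1900.0));

        System.out.println(scaleUpStillUseful(stalled, 0.1));
        System.out.println(scaleUpStillUseful(healthy, 0.1));
    }
}
```

With a check like this, the autoscaler could cap the parallelism at the last value that produced a meaningful gain instead of continuing up to max-parallelism. Comparing only the last two observations keeps the sketch short; a real implementation would likely want to smooth over several evaluation cycles to avoid reacting to noise.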