[ https://issues.apache.org/jira/browse/FLINK-33940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801992#comment-17801992 ]
Rui Fan commented on FLINK-33940: --------------------------------- Hi [~Zhanghao Chen] , IIUC flink-benchmarks[1] is checking the performance related to state, and we can see the performance change in this web UI[2]. Such as: the valueGet.HEAP[3][4] means valueState.get for hashmap state backend. !image-2024-01-03-10-52-05-861.png|width=949,height=359! [1] [https://github.com/apache/flink-benchmarks] [2][http://flink-speed.xyz|http://flink-speed.xyz/] [3][https://github.com/apache/flink-benchmarks/blob/master/src/main/java/org/apache/flink/state/benchmark/ValueStateBenchmark.java] [4]http://flink-speed.xyz/timeline/?ben=valueGet.HEAP&env=3 > Update the auto-derivation rule of max parallelism for enlarged upscaling > space > ------------------------------------------------------------------------------- > > Key: FLINK-33940 > URL: https://issues.apache.org/jira/browse/FLINK-33940 > Project: Flink > Issue Type: Improvement > Components: API / Core > Reporter: Zhanghao Chen > Assignee: Zhanghao Chen > Priority: Major > Attachments: image-2024-01-03-10-52-05-861.png > > > *Background* > The choice of the max parallelism of an stateful operator is important as it > limits the upper bound of the parallelism of the opeartor while it can also > add extra overhead when being set too large. Currently, the max parallelism > of an opeartor is either fixed to a value specified by API core / pipeline > option or auto-derived with the following rules: > {{min(max(roundUpToPowerOfTwo(operatorParallelism * 1.5), 128), 32767)}} > *Problem* > Recently, the elasticity of Flink jobs is becoming more and more valued by > users. The current auto-derived max parallelism was introduced a time time > ago and only allows the operator parallelism to be roughly doubled, which is > not desired for elasticity. Setting an max parallelism manually may not be > desired as well: users may not have the sufficient expertise to select a good > max-parallelism value. > *Proposal* > Update the auto-derivation rule of max parallelism to derive larger max > parallelism for better elasticity experience out of the box. A candidate is > as follows: > {{min(max(roundUpToPowerOfTwo(operatorParallelism * {*}5{*}), {*}1024{*}), > 32767)}} > Looking forward to your opinions on this. -- This message was sent by Atlassian Jira (v8.20.10#820010)