[ https://issues.apache.org/jira/browse/FLINK-33940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800435#comment-17800435 ]
Rui Fan commented on FLINK-33940: --------------------------------- Thanks [~gyfora] for the reminder! Too many divisors is useful for avoid data skew, 840 makes sense to me. (y) > Update the auto-derivation rule of max parallelism for enlarged upscaling > space > ------------------------------------------------------------------------------- > > Key: FLINK-33940 > URL: https://issues.apache.org/jira/browse/FLINK-33940 > Project: Flink > Issue Type: Improvement > Components: API / Core > Reporter: Zhanghao Chen > Assignee: Zhanghao Chen > Priority: Major > > *Background* > The choice of the max parallelism of an stateful operator is important as it > limits the upper bound of the parallelism of the opeartor while it can also > add extra overhead when being set too large. Currently, the max parallelism > of an opeartor is either fixed to a value specified by API core / pipeline > option or auto-derived with the following rules: > {{min(max(roundUpToPowerOfTwo(operatorParallelism * 1.5), 128), 32767)}} > *Problem* > Recently, the elasticity of Flink jobs is becoming more and more valued by > users. The current auto-derived max parallelism was introduced a time time > ago and only allows the operator parallelism to be roughly doubled, which is > not desired for elasticity. Setting an max parallelism manually may not be > desired as well: users may not have the sufficient expertise to select a good > max-parallelism value. > *Proposal* > Update the auto-derivation rule of max parallelism to derive larger max > parallelism for better elasticity experience out of the box. A candidate is > as follows: > {{min(max(roundUpToPowerOfTwo(operatorParallelism * {*}5{*}), {*}1024{*}), > 32767)}} > Looking forward to your opinions on this. -- This message was sent by Atlassian Jira (v8.20.10#820010)