[jira] [Commented] (FLINK-33940) Update the auto-derivation rule of max parallelism for enlarged upscaling space

Rui Fan (Jira) Mon, 25 Dec 2023 23:31:06 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-33940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800430#comment-17800430
 ]


Rui Fan commented on FLINK-33940:
---------------------------------

Thanks [~Zhanghao Chen] for driving this improvement and for the ping!:)

Using 1024 as the minimum value of max parallelism makes sense to me. Our internal 
Flink version also uses 1024 instead of 128, and it works fine for most jobs.

IIUC, when the parallelism of a job is very small (1 or 2) and the max 
parallelism is 1024, a single subtask will own up to 1024 key groups. On the state 
backend side, too many key groups may affect performance. (This is my concern about 
changing the default in the Flink community.)

Note: this performance drop may be insignificant in a real production 
environment.
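
To make the concern concrete, here is a minimal sketch of the arithmetic. It assumes key groups are split as evenly as possible across subtasks; the class and method names are hypothetical and this is not Flink's actual assignment code.

```java
// Illustrative sketch only; names are hypothetical, not Flink APIs.
final class KeyGroupsPerSubtask {

    // Largest number of key groups a single subtask owns, assuming the
    // key groups [0, maxParallelism) are divided as evenly as possible.
    static int maxKeyGroupsPerSubtask(int maxParallelism, int parallelism) {
        if (parallelism < 1 || parallelism > maxParallelism) {
            throw new IllegalArgumentException(
                    "parallelism must be in [1, maxParallelism]");
        }
        // Ceiling division: the busiest subtask gets the rounded-up share.
        return (maxParallelism + parallelism - 1) / parallelism;
    }

    public static void main(String[] args) {
        int[] maxParallelisms = {128, 1024};
        int[] parallelisms = {1, 2};
        for (int maxParallelism : maxParallelisms) {
            for (int parallelism : parallelisms) {
                System.out.printf(
                        "maxParallelism=%d parallelism=%d -> %d key groups per subtask%n",
                        maxParallelism,
                        parallelism,
                        maxKeyGroupsPerSubtask(maxParallelism, parallelism));
            }
        }
    }
}
```

With a lower bound of 1024 instead of 128, a job running at parallelism 1 or 2 carries eight times as many key groups per subtask.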

> Update the auto-derivation rule of max parallelism for enlarged upscaling 
> space
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-33940
>                 URL: https://issues.apache.org/jira/browse/FLINK-33940
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Core
>            Reporter: Zhanghao Chen
>            Priority: Major
>
> *Background*
> The choice of the max parallelism of a stateful operator is important as it 
> limits the upper bound of the parallelism of the operator, while it can also 
> add extra overhead when set too large. Currently, the max parallelism 
> of an operator is either fixed to a value specified by API core / pipeline 
> option or auto-derived with the following rule:
> {{min(max(roundUpToPowerOfTwo(operatorParallelism * 1.5), 128), 32767)}}
> *Problem*
> Recently, the elasticity of Flink jobs has become more and more valued by 
> users. The current auto-derived max parallelism was introduced a long time 
> ago and only allows the operator parallelism to be roughly doubled, which is 
> not desirable for elasticity. Setting a max parallelism manually may not be 
> desirable either: users may not have sufficient expertise to select a good 
> max-parallelism value.
> *Proposal*
> Update the auto-derivation rule of max parallelism to derive larger max 
> parallelism for better elasticity experience out of the box. A candidate is 
> as follows:
> {{min(max(roundUpToPowerOfTwo(operatorParallelism * {*}5{*}), {*}1024{*}), 
> 32767)}}
> Looking forward to your opinions on this.
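
For comparison, the two rules quoted above can be sketched side by side. This is a hedged illustration, not the Flink implementation: the class and method names are hypothetical, the upper bound 32767 is taken verbatim from the ticket, and truncating the fractional product before rounding up to a power of two is an assumption.

```java
// Illustrative sketch only; names are hypothetical, not Flink APIs.
final class MaxParallelismRules {

    // Upper bound exactly as written in the ticket.
    static final int UPPER_BOUND = 32767;

    // Smallest power of two that is >= x (for x >= 1).
    static int roundUpToPowerOfTwo(int x) {
        if (x <= 1) {
            return 1;
        }
        return Integer.highestOneBit(x - 1) << 1;
    }

    // min(max(roundUpToPowerOfTwo(p * num / den), lowerBound), UPPER_BOUND)
    // Assumption: the fractional part of the product is truncated.
    static int derive(int operatorParallelism, int num, int den, int lowerBound) {
        long scaled = (long) operatorParallelism * num / den;
        // Capping first avoids overflow and does not change the result,
        // because any value >= UPPER_BOUND ends up at UPPER_BOUND anyway.
        int capped = (int) Math.min(scaled, UPPER_BOUND);
        return Math.min(Math.max(roundUpToPowerOfTwo(capped), lowerBound), UPPER_BOUND);
    }

    // Current rule: factor 1.5, lower bound 128.
    static int currentRule(int operatorParallelism) {
        return derive(operatorParallelism, 3, 2, 128);
    }

    // Proposed candidate: factor 5, lower bound 1024.
    static int proposedRule(int operatorParallelism) {
        return derive(operatorParallelism, 5, 1, 1024);
    }

    public static void main(String[] args) {
        for (int p : new int[] {1, 100, 300, 10000}) {
            System.out.printf("parallelism=%d current=%d proposed=%d%n",
                    p, currentRule(p), proposedRule(p));
        }
    }
}
```

Under these assumptions, an operator at parallelism 300 gets a max parallelism of 512 with the current rule and 2048 with the proposed one.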



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
