[ 
https://issues.apache.org/jira/browse/FLINK-24815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444925#comment-17444925
 ] 

ming li edited comment on FLINK-24815 at 11/17/21, 3:43 AM:
------------------------------------------------------------

[~pnowojski], you are right. Although the size of the original 
{{OperatorSubtaskState}} can be passed to the builder, that size is no longer 
correct when the parallelism changes or when there is union state.

Therefore, I prefer to compute the {{stateSize}} lazily, when the 
{{getStateSize}} method is called, or to pass a magic number to the builder 
as a placeholder (it seems that the {{stateSize}} is not used during job 
recovery).
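
A minimal sketch of the lazy option, to make the idea concrete. It is not 
Flink's actual code: {{SubtaskState}}, {{UNKNOWN_SIZE}} and {{traversals}} are 
hypothetical names, and the shared state handles are reduced to a list of 
byte counts. The sentinel plays the role of the magic-number placeholder, and 
the sum is only computed the first time {{getStateSize}} is called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustration only: a hypothetical stand-in, not Flink's OperatorSubtaskState. */
final class LazyStateSizeSketch {

    /** Placeholder ("magic number") meaning the size has not been computed yet. */
    static final long UNKNOWN_SIZE = -1L;

    static final class SubtaskState {
        // Assumption: each shared state handle is represented by its size in bytes.
        private final List<Long> sharedStateSizes;
        private long stateSize = UNKNOWN_SIZE;
        private int traversals = 0; // counts how often the handles were walked

        SubtaskState(List<Long> sharedStateSizes) {
            // No traversal here: constructing the object stays cheap.
            this.sharedStateSizes = new ArrayList<>(sharedStateSizes);
        }

        /** Computes the size on first use and caches it. Not thread-safe. */
        long getStateSize() {
            if (stateSize == UNKNOWN_SIZE) {
                long sum = 0L;
                for (long size : sharedStateSizes) {
                    sum += size;
                }
                stateSize = sum;
                traversals++;
            }
            return stateSize;
        }

        int traversals() {
            return traversals;
        }
    }

    public static void main(String[] args) {
        // Same shape as the example in the issue: 2000 subtasks, 100 handles each.
        List<Long> handles = Collections.nCopies(100, 1L);
        List<SubtaskState> subtasks = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            subtasks.add(new SubtaskState(handles));
        }

        int walked = 0;
        for (SubtaskState s : subtasks) {
            walked += s.traversals();
        }
        System.out.println("traversals after reassignment: " + walked);

        // Only a caller that actually needs the size pays for the traversal.
        System.out.println("size of subtask 0: " + subtasks.get(0).getStateSize());
    }
}
```

With this shape, reassigning state after a failover creates the objects 
without walking any handles. The trade-off is that the first caller of 
{{getStateSize}} pays the cost instead, and real code would have to decide 
how to handle concurrent access.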


> Reduce the cpu cost of calculating stateSize during state allocation
> --------------------------------------------------------------------
>
>                 Key: FLINK-24815
>                 URL: https://issues.apache.org/jira/browse/FLINK-24815
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.14.0
>            Reporter: ming li
>            Priority: Major
>
> When a task fails over, we reassign the state for each subtask and create a 
> new {{OperatorSubtaskState}} object. At that point, the {{stateSize}} field 
> of the {{OperatorSubtaskState}} is recalculated. With incremental 
> checkpoints, computing this field requires traversing all shared states and 
> summing their sizes.
> Taking a job with a parallelism of 2000 and 100 shared states per task as an 
> example, this requires 2000 * 100 = 200,000 traversal steps, which saturates 
> the CPU of the JM scheduling thread.
> I think we can either provide a constructor that accepts {{stateSize}} for 
> {{OperatorSubtaskState}} or delay the calculation of {{stateSize}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
