[ https://issues.apache.org/jira/browse/FLINK-24815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444925#comment-17444925 ]
ming li edited comment on FLINK-24815 at 11/17/21, 3:43 AM:
------------------------------------------------------------

[~pnowojski], um... you are right. Although the size of the original {{OperatorSubtaskState}} can be passed to the builder, that does not work when the parallelism changes or when there is union-type state. Therefore, I would prefer either to compute {{stateSize}} when {{getStateSize}} is called, or to pass a magic number to the builder as a placeholder (it seems that {{stateSize}} is not used during job recovery).


was (Author: ming li):
[~pnowojski], um... you are right. Although the size of the original {{OperatorSubtaskState}} can be passed to the builder, it will not work when the parallelism changes or there is a union type state. Therefore, I prefer to update the {{stateSize}} when calling the {{getStateSize}} method, or here we pass a magic number as a placeholder (it seems that the {{stateSize}} is not used during the job recovery).

> Reduce the cpu cost of calculating stateSize during state allocation
> --------------------------------------------------------------------
>
>                 Key: FLINK-24815
>                 URL: https://issues.apache.org/jira/browse/FLINK-24815
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.14.0
>            Reporter: ming li
>            Priority: Major
>
> When a task fails over, we reassign the state for each subtask and
> create a new {{OperatorSubtaskState}} object. At that point, the {{stateSize}}
> field in the {{OperatorSubtaskState}} is recalculated. With incremental
> checkpoints, computing this field requires traversing all shared
> states and summing their sizes.
> Taking a job with a parallelism of 2000 and 100 shared states per task as an
> example, this requires 2000 * 100 = 200,000 traversal steps. While this runs,
> the JM scheduling thread fully occupies its CPU.
> I think we can either provide a constructor that takes {{stateSize}} for
> {{OperatorSubtaskState}} or delay the calculation of {{stateSize}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
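
For illustration, the "delay the calculation" option from the issue description could look roughly like the sketch below. This is not Flink code: {{LazySizedState}}, {{handleSizes}}, and {{traversals()}} are hypothetical names, and the real {{OperatorSubtaskState}} holds state handle collections rather than a plain array of sizes. The sketch only shows the idea that the constructor does no traversal and that the sum is computed once, on the first {{getStateSize()}} call.

```java
/** Hypothetical stand-in for a subtask state; NOT Flink's OperatorSubtaskState. */
final class LazySizedState {
    private final long[] handleSizes; // assumed: size in bytes of each (shared) state handle
    private long cachedSize = -1L;    // -1 means "not computed yet"
    private int traversals = 0;       // counts full traversals, for illustration only

    LazySizedState(long[] handleSizes) {
        // No summation here, so creating many instances during state
        // reassignment stays cheap.
        this.handleSizes = handleSizes.clone();
    }

    /** Computes the total size on first use and caches it afterwards. */
    long getStateSize() {
        if (cachedSize < 0) {
            long sum = 0L;
            for (long size : handleSizes) {
                sum += size;
            }
            cachedSize = sum;
            traversals++;
        }
        return cachedSize;
    }

    int traversals() {
        return traversals;
    }
}
```

With this shape, reassigning state for 2000 subtasks creates 2000 objects without traversing any handles; the traversal cost is paid only by callers that actually ask for the size. The sketch is not thread-safe, so concurrent callers would need synchronization or an equivalent safe-publication scheme.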