[ https://issues.apache.org/jira/browse/FLINK-31245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weijie Guo updated FLINK-31245:
-------------------------------
    Fix Version/s: 2.1.0
                       (was: 2.0.0)

> Adaptive scheduler does not reset the state of GlobalAggregateManager when rescaling
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-31245
>                 URL: https://issues.apache.org/jira/browse/FLINK-31245
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 2.1.0
>            Reporter: Zhanghao Chen
>            Priority: Major
>             Fix For: 2.1.0
>
>
> *Problem*
> GlobalAggregateManager is used to share state amongst parallel tasks in a job
> and thus coordinate their execution. It keeps that state (the _accumulators_
> field in JobMaster) in JM memory. The content of the accumulator state is
> defined by user code. At my company, one user stores the task parallelism in
> the accumulator, assuming that task parallelism never changes. That assumption
> breaks under the adaptive scheduler, which can change the parallelism when
> rescaling while the accumulator state is left as it was.
> *Possible Solutions*
> # Mark GlobalAggregateManager as deprecated. It seems that the operator
> coordinator can completely replace GlobalAggregateManager and is a more
> elegant solution. If so, it is fine to deprecate GlobalAggregateManager and
> leave this issue as it is; we can open a separate ticket for the deprecation.
> # If we decide to continue supporting GlobalAggregateManager, then we need
> to reset its state when rescaling.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
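
For illustration, the failure mode described under *Problem* and the reset proposed in solution 2 can be modelled roughly as below. This is a minimal, self-contained sketch and not Flink code: ToyAggregateManager, resetOnRescale and rememberedParallelism are hypothetical names invented for this example, and the map only stands in for the _accumulators_ field mentioned in the issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class StaleAggregateSketch {

    /** Hypothetical stand-in for the JobMaster-side accumulator map; not Flink's actual class. */
    static final class ToyAggregateManager {
        private final Map<String, Object> accumulators = new HashMap<>();

        /** Folds a value into the named accumulator and returns the new accumulator value. */
        @SuppressWarnings("unchecked")
        synchronized <V, A> A update(String name, V value, A initial, BiFunction<A, V, A> add) {
            A current = (A) accumulators.getOrDefault(name, initial);
            A next = add.apply(current, value);
            accumulators.put(name, next);
            return next;
        }

        /** Assumed fix (solution 2): drop all accumulators when the job is rescaled. */
        synchronized void resetOnRescale() {
            accumulators.clear();
        }
    }

    /** User logic that stores the parallelism once and assumes it never changes. */
    static int rememberedParallelism(ToyAggregateManager manager, int observedParallelism) {
        // Keep the first value ever stored; -1 marks "not set yet".
        return manager.<Integer, Integer>update(
                "parallelism",
                observedParallelism,
                -1,
                (stored, observed) -> stored == -1 ? observed : stored);
    }

    public static void main(String[] args) {
        ToyAggregateManager manager = new ToyAggregateManager();

        // Initial deployment at parallelism 4.
        System.out.println(rememberedParallelism(manager, 4)); // prints 4

        // Rescale to 8 without a reset: the stale value survives.
        System.out.println(rememberedParallelism(manager, 8)); // prints 4

        // Rescale to 8 with a reset: the new parallelism is picked up.
        manager.resetOnRescale();
        System.out.println(rememberedParallelism(manager, 8)); // prints 8
    }
}
```

Clearing every accumulator is the bluntest possible reset. Whether a real fix should clear everything or let user code decide what survives a rescale is left open by the issue.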