[ https://issues.apache.org/jira/browse/FLINK-20605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249624#comment-17249624 ]

Till Rohrmann commented on FLINK-20605:
---------------------------------------

I guess for the first problem we need to check whether it is still valid before processing the {{handleAsync}} callback. For the second problem we might be missing a check whether we are shut down or not. For the third problem we either tolerate duplicate status updates or need to enforce on the sender side that it is only sent once. In general, the former approach should be more robust.

> DeclarativeSlotManager crashes if slot allocation notification is processed after taskexecutor shutdown
> -------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-20605
>                 URL: https://issues.apache.org/jira/browse/FLINK-20605
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.13.0
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
>
> It is possible that a notification from a task executor about a slot being
> allocated can be processed after that very task executor has unregistered
> itself from the resource manager.
> As a result we run into an exception when trying to mark this slot as
> allocated, because it no longer exists and a precondition catches this case.
> We could solve this by checking in
> {{DeclarativeResourceManager#allocateSlot}} whether the task executor we
> received the acknowledgement from is still registered.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
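
The three checks discussed in the comment (task executor still registered, manager not shut down, duplicate updates tolerated) could be sketched roughly as follows. This is a minimal, self-contained illustration and not the Flink implementation: the class SlotAckSketch, the SlotTracker type, its methods, and the ids "tm-1" and "slot-0" are all hypothetical names invented for this sketch, and a same-thread executor stands in for the main-thread executor.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical sketch; none of these names are taken from the Flink code base. */
public class SlotAckSketch {

    /** Minimal stand-in for the slot bookkeeping of a slot manager. */
    static class SlotTracker {
        // Task executor id -> ids of its slots that are marked as allocated.
        private final Map<String, Set<String>> allocatedSlots = new HashMap<>();
        private boolean started = true;

        void registerTaskExecutor(String taskExecutorId) {
            allocatedSlots.putIfAbsent(taskExecutorId, new HashSet<>());
        }

        void unregisterTaskExecutor(String taskExecutorId) {
            allocatedSlots.remove(taskExecutorId);
        }

        void close() {
            started = false;
            allocatedSlots.clear();
        }

        /**
         * Handles an allocation acknowledgement. Returns true if the slot was newly
         * marked as allocated, false if the acknowledgement was ignored.
         */
        boolean onAllocationAck(String taskExecutorId, String slotId) {
            if (!started) {
                return false; // the manager was shut down in the meantime
            }
            Set<String> slots = allocatedSlots.get(taskExecutorId);
            if (slots == null) {
                return false; // late ack: the task executor already unregistered
            }
            return slots.add(slotId); // false for a duplicate, which is tolerated
        }
    }

    public static void main(String[] args) {
        SlotTracker tracker = new SlotTracker();
        tracker.registerTaskExecutor("tm-1");

        // Runs callbacks in the calling thread; stands in for a main-thread executor.
        Executor directExecutor = Runnable::run;

        CompletableFuture<Void> ack = new CompletableFuture<>();
        CompletableFuture<Boolean> result = ack.handleAsync(
                (ignored, failure) ->
                        failure == null && tracker.onAllocationAck("tm-1", "slot-0"),
                directExecutor);

        // The task executor unregisters before the acknowledgement arrives.
        tracker.unregisterTaskExecutor("tm-1");
        ack.complete(null);

        System.out.println("late ack applied: " + result.join());
    }
}
```

In this sketch the late acknowledgement is dropped instead of tripping a precondition, so the program should print "late ack applied: false". Ignoring duplicates on the receiving side corresponds to the more tolerant of the two options mentioned for the third problem.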