XComp commented on a change in pull request #14847: URL: https://github.com/apache/flink/pull/14847#discussion_r581941734
########## File path: flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/DefaultSchedulerTest.java ########## @@ -541,6 +544,59 @@ public void handleGlobalFailure() { assertThat(deployedExecutionVertices, contains(executionVertexId, executionVertexId)); } + /** + * This test covers the use-case where a global fail-over is followed by a local task failure. + * It verifies (besides checking the expected deployments) that the assert in the global + * recovery handling of {@link SchedulerBase#restoreState} is not triggered due to version + * updates. + */ + @Test + public void handleGlobalFailureWithLocalFailure() { Review comment: We cannot remove [assert in SchedulerBase.restoreState](https://github.com/XComp/flink/blob/7cbd97f815d3bfd4715ddd8cb1d88f92f05a282a/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java#L419) as the following operation relies on all vertices being passed for restoring the state. Accepting less `ExecutionJobVertices` would mean that not all tasks would get restored. I added this test to verify the case where a local task failure proceeds a global fail-over before actually restarting the tasks. The local task failure would not trigger a version update because of the [isNotifiable](https://github.com/XComp/flink/blob/7cbd97f815d3bfd4715ddd8cb1d88f92f05a282a/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java#L654) check in `SchedulerBase.updateTaskExecutionState`. This will only return `true` (and trigger the internal task state change) if the corresponding `ExecutionVertex` is either in state `FINISHED` or `FAILED`. The previous global fail-over has switched the state already to `CANCELING`. The local task failure will finally switch the state to `CANCELED`. Hence, no internal task state change is processed for the local task failure. No version upgrade is triggered and the global fail-over restart restarts the all vertices as expected. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org