pnowojski commented on code in PR #26861:
URL: https://github.com/apache/flink/pull/26861#discussion_r2281795478


##########
flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java:
##########
@@ -885,6 +850,53 @@ else if (transitionState(current, ExecutionState.FAILED, 
t)) {
         }
     }
 
+    /**
+     * Transition into our final state in case of failure. We should be either 
in DEPLOYING,
+     * INITIALIZING, RUNNING, CANCELING, or FAILED loop for multiple retries 
during concurrent state
+     * changes via calls to cancel() or to failExternally()
+     */
+    private void transitionStateOnFailure(
+            Throwable t, AutoCloseableRegistry postFailureCleanUpRegistry) 
throws IOException {
+        while (true) {
+            ExecutionState current = this.executionState;

Review Comment:
   1. I would prefer avoid touching this, as this is pre-existing logic that 
I'm just extracting to a separate method, and I would like to minimise changes 
for this already pretty fragile bug fix (I've been struggling to fix all of the 
failing e2e/itcases for quite some time).
   2. This can't loop for more then one or maybe a couple of times. Task can't 
keep changing it's state for a long period of time. At worst this will just 
make two or three iterations.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to