AHeise commented on a change in pull request #13499:
URL: https://github.com/apache/flink/pull/13499#discussion_r496475068



##########
File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java
##########
@@ -518,6 +518,11 @@ private void readRecoveredChannelState() throws IOException, InterruptedExceptio
                                                        "Cannot restore state to a non-checkpointable partition type: " + writer);
                                }
                        }
+
+                       if (!recordWriter.isAvailable()) {
+                               MailboxDefaultAction.Suspension suspendedDefaultAction = mailboxProcessor.suspendDefaultAction();
+                               getInputOutputJointFuture(InputStatus.NOTHING_AVAILABLE).thenRun(suspendedDefaultAction::resume);
+                       }

Review comment:
       I didn't manage to create a unit test, so I will probably add an ITCase.
I'm extending the commit message to state: "Currently, the task thread blocks
if all output buffers are taken during recovery: the default action is only
suspended after calling StreamTask#processInput once, which will block as soon
as one element is emitted. With this fix, the task thread suspends input
processing if all output buffers are taken during recovery."
   
   > What if getInputOutputJointFuture returns a completed future, but it
becomes unavailable during the input recovery?
   
   This is the current behavior: input processing is enabled by default. What
happens is that the first call to `#processInput` blocks and sets the future
correctly as soon as one output buffer has been processed. Note that input
availability should not be set at this point; that can only happen once the
first input is recovered.
   
   > Is it actually working?
   
   Yes, but I can only merge the corresponding test after we allow concurrent
checkpoints, or else we run into livelocks: in the non-rescaling case, recovery
of input channels can only happen if `#processInput` is called once, because
`EndOfChannelStateEvent` is an extra buffer that is only polled when more
input channels are available.
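
For readers who do not know the mailbox model, the suspend-then-resume pattern in the diff above can be sketched in plain Java. This is only an illustration under stated assumptions: `ToyMailbox`, `suspendUntilAvailable`, and `SuspendSketch` are hypothetical names, not Flink classes. The sketch is single-threaded and ignores the thread-safety that the real mailbox provides.

```java
import java.util.concurrent.CompletableFuture;

final class SuspendSketch {

    /** Hypothetical toy stand-in for a mailbox processor; not Flink's MailboxProcessor. */
    static final class ToyMailbox {
        boolean suspended;
        int processedCalls;

        /** Suspends the default action and returns a handle that resumes it. */
        Runnable suspendDefaultAction() {
            suspended = true;
            return () -> suspended = false;
        }

        /** Runs the default action once unless it is suspended; returns true if it ran. */
        boolean runDefaultActionOnce() {
            if (suspended) {
                return false;
            }
            processedCalls++;
            return true;
        }
    }

    /**
     * If the availability future is not yet complete, suspend the default action
     * now and resume it when the future completes, instead of letting the first
     * call to the default action block.
     */
    static void suspendUntilAvailable(ToyMailbox mailbox, CompletableFuture<?> available) {
        if (!available.isDone()) {
            Runnable resume = mailbox.suspendDefaultAction();
            available.thenRun(resume);
        }
    }

    public static void main(String[] args) {
        ToyMailbox mailbox = new ToyMailbox();
        CompletableFuture<Void> outputAvailable = new CompletableFuture<>();

        suspendUntilAvailable(mailbox, outputAvailable);
        System.out.println("ran while output is full: " + mailbox.runDefaultActionOnce());

        // Simulates an output buffer being freed.
        outputAvailable.complete(null);
        System.out.println("ran after a buffer was freed: " + mailbox.runDefaultActionOnce());
    }
}
```

The point of the sketch is that the loop never enters the default action while the output is unavailable, so it stays free to handle other work, which matches the intent stated in the extended commit message.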





