zhijiangW commented on a change in pull request #11687: URL: https://github.com/apache/flink/pull/11687#discussion_r420167190
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/LocalInputChannel.java
##########
@@ -168,6 +175,12 @@ public void run() {
 	Optional<BufferAndAvailability> getNextBuffer() throws IOException, InterruptedException {
 		checkError();
 
+		BufferAndAvailability bufferAndAvailability = getNextRecoveredStateBuffer();
+		if (bufferAndAvailability != null) {

Review comment:
I know that the current data notification from the downstream side is not very accurate, which can sometimes lead to a `null` buffer being returned. The channel state notification, however, can be made accurate (in fact, it is already the ground truth now).

The other invariant is that channel state consumption should always happen before downstream consumption, so there should be no problem here. If `getNextBuffer` is triggered by a channel state buffer, it can always get a buffer. If it is instead triggered by a downstream-side notification, that in itself confirms that channel state recovery has already finished.
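
To make the ordering argument concrete, here is a minimal sketch, not the actual `LocalInputChannel` code. The class `RecoveringChannelSketch`, its methods `addRecovered`, `finishRecovery`, and `addRegular`, and the use of `String` in place of a real buffer type are all hypothetical names chosen for illustration. It assumes recovered channel state is fully drained before any regular data is handed out.

```java
import java.util.ArrayDeque;
import java.util.Optional;
import java.util.Queue;

/** Hypothetical stand-in for an input channel; this is NOT Flink's LocalInputChannel. */
final class RecoveringChannelSketch {
    // Buffers restored from channel state (assumption: plain strings stand in for buffers).
    private final Queue<String> recoveredState = new ArrayDeque<>();
    // Buffers arriving on the regular data path after recovery.
    private final Queue<String> regularData = new ArrayDeque<>();
    private boolean recoveryFinished;

    void addRecovered(String buffer) {
        if (recoveryFinished) {
            throw new IllegalStateException("recovery already finished");
        }
        recoveredState.add(buffer);
    }

    void finishRecovery() {
        recoveryFinished = true;
    }

    void addRegular(String buffer) {
        regularData.add(buffer);
    }

    /** Recovered state is always served first; regular data only once recovery is done. */
    Optional<String> getNextBuffer() {
        String recovered = recoveredState.poll();
        if (recovered != null) {
            return Optional.of(recovered);
        }
        if (!recoveryFinished) {
            // More recovered state may still arrive, so regular data must wait.
            return Optional.empty();
        }
        return Optional.ofNullable(regularData.poll());
    }

    public static void main(String[] args) {
        RecoveringChannelSketch channel = new RecoveringChannelSketch();
        channel.addRecovered("state-1");
        channel.addRegular("data-1");
        System.out.println(channel.getNextBuffer()); // recovered buffer comes first
        System.out.println(channel.getNextBuffer()); // empty: recovery not finished yet
        channel.finishRecovery();
        System.out.println(channel.getNextBuffer()); // regular data is now served
    }
}
```

In this sketch, a call triggered by a recovered buffer always finds one, and a regular buffer can only be returned after `finishRecovery()` has been called, which mirrors the two cases described in the comment.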