zhijiangW commented on a change in pull request #7911: [FLINK-11082][network] Fix the logic of getting backlog in sub partition URL: https://github.com/apache/flink/pull/7911#discussion_r264508671
########## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/ResultSubpartition.java ########## @@ -116,52 +115,58 @@ protected Throwable getFailureCause() { public abstract boolean isReleased(); - /** - * Gets the number of non-event buffers in this subpartition. - * - * <p><strong>Beware:</strong> This method should only be used in tests in non-concurrent access - * scenarios since it does not make any concurrency guarantees. - */ - @VisibleForTesting - public int getBuffersInBacklog() { - return buffersInBacklog; - } - /** * Makes a best effort to get the current size of the queue. * This method must not acquire locks or interfere with the task and network threads in * any way. */ public abstract int unsynchronizedGetNumberOfQueuedBuffers(); + /** + * Gets the number of non-event buffers in this subpartition. + */ + public abstract int getBuffersInBacklog(); + + /** + * @param lastBufferAvailable whether the last buffer in this subpartition is available for consumption + * @return the number of non-event buffers in this subpartition + */ + protected int getBuffersInBacklog(boolean lastBufferAvailable) { Review comment: That is very good question. We define the variable `buffersInBacklog` for counting the non-event buffers in the queue. `getNumberOfFinishedBuffers` and `isAvailableUnsafe` are counting both event and non-event buffers. So they can not be unified directly. I even tried another way by counting only event buffers instead of current `buffersInBacklog`, but it still does not simplify the logic. Especially for `SpillableSubpartition` we could not get total buffers directly from the memory queue, then we still need non-event buffer counting even though we count the event buffers. I also find that the current conditions seem a bit complicated to maintain in different processes, such as `shouldNotifyDataAvailable`, `isAvailable`, `getBuffersInBacklog`, etc. And it would be better if we can integrate something to make related conditions unification. After we have a good idea for this issue, I would like to make changes. :) ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services