[ https://issues.apache.org/jira/browse/FLINK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872271#comment-16872271 ]
Stephan Ewen commented on FLINK-12070:
--------------------------------------

I could also quickly turn this branch into a PR, if there is interest in it: https://github.com/StephanEwen/incubator-flink/tree/bounded

A quick thought on the errors in the FILE mode: apparently, active partitions can get scheduled multiple times in the netty event loops. I am curious whether that is a problem in the old "SpillableSubpartition" as well, because the main difference is that the new FILE mode uses one in-flight buffer (which should be enough), while the old implementation uses n buffers, meaning it can be double-enqueued n extra times. I am not sure whether that is safe, or just works in practice because we never "double schedule" the partition more than n times.

This is not a blocker for advancing with the FILE_MMAP mode, but if we want to use the FILE mode, it would be good to understand the exact behavior.

> Make blocking result partitions consumable multiple times
> ---------------------------------------------------------
>
>                 Key: FLINK-12070
>                 URL: https://issues.apache.org/jira/browse/FLINK-12070
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.9.0
>            Reporter: Till Rohrmann
>            Assignee: Stephan Ewen
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.9.0
>
>         Attachments: image-2019-04-18-17-38-24-949.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to avoid writing produced results multiple times for multiple
> consumers and in order to speed up batch recoveries, we should make the
> blocking result partitions consumable multiple times. At the moment a
> blocking result partition will be released once the consumers have processed
> all data. Instead the result partition should be released once the next
> blocking result has been produced and all consumers of a blocking result
> partition have terminated.
> Moreover, blocking results should not hold on to slot resources like network
> buffers or memory, as is currently the case with {{SpillableSubpartitions}}.