[ https://issues.apache.org/jira/browse/FLINK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818682#comment-16818682 ]
ryantaocer commented on FLINK-12070:
------------------------------------

It's so cool to unify the network buffered and spilled partitions! [~StephanEwen]

A few comments:
# Small files: in general, memory-mapped IO can be very efficient for large files. But as you mentioned, we need to test its performance for small files. In our production cluster, the data size of each result partition produced for each downstream task is usually no more than about 50MB.
# Multiple reads: the ByteBuffer object returned by FileChannel.map() has an internal position, so it cannot be shared among simultaneous reads, such as those from speculative execution. In other cases, it can be reset to the beginning for subsequent reads.

> Make blocking result partitions consumable multiple times
> ---------------------------------------------------------
>
>                 Key: FLINK-12070
>                 URL: https://issues.apache.org/jira/browse/FLINK-12070
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Till Rohrmann
>            Assignee: BoWang
>            Priority: Major
>
> In order to avoid writing produced results multiple times for multiple
> consumers and in order to speed up batch recoveries, we should make the
> blocking result partitions consumable multiple times. At the moment a
> blocking result partition will be released once the consumers have processed
> all data. Instead the result partition should be released once the next
> blocking result has been produced and all consumers of a blocking result
> partition have terminated. Moreover, blocking results should not hold on to slot
> resources like network buffers or memory as is currently the case with
> {{SpillableSubpartitions}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
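
On the multiple-reads point in the comment above: one possible way to let several readers consume the same mapped region at once is to give each reader its own view via ByteBuffer.duplicate(), which shares the bytes but keeps a separate position, limit and mark. The following is only a minimal sketch under that assumption, not what Flink does; the class name MappedPartitionReadDemo and all of its helper methods are hypothetical and exist purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; none of these names come from Flink.
class MappedPartitionReadDemo {

    /** Writes the bytes to a temporary file that stands in for a spilled partition. */
    static Path writeTempFile(byte[] data) {
        try {
            Path file = Files.createTempFile("partition", ".data");
            file.toFile().deleteOnExit();
            Files.write(file, data);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps the whole file read-only. The mapping stays valid after the channel is closed. */
    static MappedByteBuffer mapReadOnly(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a per-reader view: same underlying bytes, independent position. */
    static ByteBuffer newReaderView(ByteBuffer mapped) {
        ByteBuffer view = mapped.duplicate();
        view.rewind(); // position back to 0; the limit is inherited from the source buffer
        return view;
    }

    /** Consumes everything between the view's position and its limit. */
    static String readAll(ByteBuffer view) {
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Path file = writeTempFile("record-1|record-2".getBytes(StandardCharsets.UTF_8));
        MappedByteBuffer mapped = mapReadOnly(file);

        // Two readers, e.g. a regular and a speculative consumer.
        ByteBuffer first = newReaderView(mapped);
        ByteBuffer second = newReaderView(mapped);

        first.get(); // advances only the first reader's position
        System.out.println("first position:  " + first.position());
        System.out.println("second position: " + second.position());
        System.out.println("second reads:    " + readAll(second));
    }
}
```

Each view is still not safe to share between threads, so the sketch assumes one view per reader. The file is mapped read-only, so concurrent readers never observe writes through the mapping.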