[ https://issues.apache.org/jira/browse/FLINK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818619#comment-16818619 ]
zhijiang edited comment on FLINK-12070 at 4/16/19 4:25 AM:
-----------------------------------------------------------

Thanks for providing the POC for this feature, [~StephanEwen].

I really like the approach of having {{SpillableSubpartition}} simply write directly to a persistent file and make efficient use of the OS cache via mmap. Although the previous in-memory intermediate state of {{SpillableSubpartition}} could bring performance benefits for small-scale data, it makes the logic more complex to maintain, and other tasks might even be impacted when it spills to a disk file to release memory segments. In addition, the partition data is not deterministic with regard to the FLIP-1 requirements. These points were also concerns of mine, and I was planning to focus on them.

As for caching intermediate results, we once implemented a similar function in a previous Blink version, but it was eventually abandoned for certain reasons. This feature is actually valuable in production, because it can both deliver the speed of an in-memory shuffle, as in the streaming pipelined mode, and meet the failover requirements via asynchronous eviction to disk. The mmap partition might open up more possibilities here.

BTW, what is your plan for putting this POC into practice? Do you need help with adding unit tests and verifying benchmarks?

The issue of releasing partitions after consumption could be considered together with the proposal for partition lifecycle management.

> Make blocking result partitions consumable multiple times
> ---------------------------------------------------------
>
>                 Key: FLINK-12070
>                 URL: https://issues.apache.org/jira/browse/FLINK-12070
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Till Rohrmann
>            Assignee: BoWang
>            Priority: Major
>
> In order to avoid writing produced results multiple times for multiple
> consumers and in order to speed up batch recoveries, we should make
> blocking result partitions consumable multiple times. At the moment a
> blocking result partition is released once the consumers have processed
> all data. Instead, the result partition should be released once the next
> blocking result has been produced and all consumers of the blocking result
> partition have terminated. Moreover, blocking results should not hold on to
> slot resources like network buffers or memory, as is currently the case with
> {{SpillableSubpartitions}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)