[ https://issues.apache.org/jira/browse/FLINK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825700#comment-16825700 ]
Yingjie Cao edited comment on FLINK-12070 at 4/25/19 4:21 AM:
--------------------------------------------------------------

The test is still running and will take a bit more time. In the meantime, I'd like to report two bugs that blocked my test.

The first one is in release-1.8.0, and I have created a JIRA for it here: https://issues.apache.org/jira/browse/FLINK-12329.

The second one is in the mmappartition branch. In that branch, a file is closed when the EOF event is read. However, at that point the data (including previous Buffers) may not yet have been loaded into memory, because the data, except for the pre-loaded part, is not loaded into memory until it is accessed by the Netty thread when sending the data out. Adding a send-complete callback listener to the EOF event and closing the file (releasing the resource) in the callback function can fix the bug.

Judging from the current partial test results, when the data volume is small, the old implementation is a bit faster, and when the data volume is large, the new implementation is faster. However, neither implementation is fast enough on SATA when both the data volume and the parallelism are large.

Because of the bug, the usability of the current Flink blocking subpartition is poor. Looking forward to the new mmappartition implementation.

was (Author: kevin.cyj):

The test is still running and will take a bit more time. In the meantime, I'd like to report two bugs that blocked my test.

The first one is in release-1.8.0, and I have created a JIRA for it here: https://issues.apache.org/jira/browse/FLINK-12329.

The second one is in the mmappartition branch. In that branch, a file is closed when the EOF event is read. However, at that point the data (including previous Buffers) may not yet have been loaded into memory, because the data, except for the pre-loaded part, is not loaded into memory until it is accessed by the Netty thread when sending the data out.
Adding a send-complete callback listener to the EOF event and closing the file (releasing the resource) in the callback function can fix the bug.

Judging from the current partial test results, when the data volume is small, the old implementation is a bit faster, and when the data volume is large, the new implementation is faster. However, neither implementation is fast enough on SATA when the data volume is large.

Because of the bug, the usability of the current Flink blocking subpartition is poor. Looking forward to the new mmappartition implementation.

> Make blocking result partitions consumable multiple times
> ---------------------------------------------------------
>
>                 Key: FLINK-12070
>                 URL: https://issues.apache.org/jira/browse/FLINK-12070
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Till Rohrmann
>            Assignee: BoWang
>            Priority: Major
>         Attachments: image-2019-04-18-17-38-24-949.png
>
>
> In order to avoid writing produced results multiple times for multiple
> consumers and in order to speed up batch recoveries, we should make the
> blocking result partitions consumable multiple times. At the moment a
> blocking result partition will be released once the consumers have processed
> all data. Instead the result partition should be released once the next
> blocking result has been produced and all consumers of a blocking result
> partition have terminated. Moreover, blocking results should not hold on to slot
> resources like network buffers or memory, as is currently the case with
> {{SpillableSubpartitions}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
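
The fix proposed in the comment (close the file from a send-complete callback on the EOF event rather than when the EOF event is read) can be illustrated with a minimal sketch. This is not code from the mmappartition branch: `PartitionFile`, `Sender`, and `run` are hypothetical stand-ins, and a `CompletableFuture` plays the role of the send-complete notification. With Netty, the corresponding hook would presumably be a listener on the `ChannelFuture` returned by the write, but that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class CloseOnSendComplete {

    /** Hypothetical stand-in for the memory-mapped partition file. */
    static final class PartitionFile implements AutoCloseable {
        private boolean closed;

        /** Simulates touching data lazily; fails if the file is already closed. */
        byte[] read(int index) {
            if (closed) {
                throw new IllegalStateException("file already closed");
            }
            return new byte[] {(byte) index};
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Hypothetical stand-in for the network layer: data is only touched on flush. */
    static final class Sender {
        private final List<Runnable> pending = new ArrayList<>();

        /** Queues a message; the returned future completes once it has been "sent". */
        CompletableFuture<Void> enqueue(Runnable touchData) {
            CompletableFuture<Void> sent = new CompletableFuture<>();
            pending.add(() -> {
                try {
                    touchData.run();
                    sent.complete(null);
                } catch (RuntimeException e) {
                    sent.completeExceptionally(e);
                }
            });
            return sent;
        }

        /** Sends everything queued so far, in order. */
        void flush() {
            List<Runnable> batch = new ArrayList<>(pending);
            pending.clear();
            batch.forEach(Runnable::run);
        }
    }

    /** Returns true if all buffers were sent and the file ended up closed. */
    static boolean run(boolean closeInCallback) {
        PartitionFile file = new PartitionFile();
        Sender sender = new Sender();
        List<CompletableFuture<Void>> buffers = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            final int index = i;
            buffers.add(sender.enqueue(() -> file.read(index)));
        }
        // The EOF event is "read" here, before any queued buffer has been sent.
        CompletableFuture<Void> eof = sender.enqueue(() -> { });
        if (closeInCallback) {
            // Proposed fix: release the file only after EOF has actually been sent.
            eof.whenComplete((ok, err) -> file.close());
        } else {
            // Reported behaviour: the file is closed as soon as EOF is read.
            file.close();
        }
        sender.flush();
        boolean allSent = buffers.stream()
                .noneMatch(CompletableFuture::isCompletedExceptionally);
        return allSent && file.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("close when EOF is read:     " + run(false));
        System.out.println("close on EOF send-complete: " + run(true));
    }
}
```

In this sketch, closing at EOF-read time makes the later buffer reads fail, while closing in the callback lets every queued buffer go out before the resource is released.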