[ 
https://issues.apache.org/jira/browse/FLINK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855659#comment-16855659
 ] 

Piotr Nowojski commented on FLINK-12070:
----------------------------------------

{quote}As for the reason why the machine froze up, I guess it is because 
flushing the mmapped region to disk also needs memory while there are not enough pages left.
{quote}
That would be my guess as well, however it might be tough to confirm directly. 
What [~StephanEwen] proposed as fixes
{quote}1. Directly write to a file and mmap the file, rather than writing to 
the mmapped region. That way the data should be eagerly persisted, i.e., there is 
no I/O needed when memory pages are evicted.

2. Directly write to file and directly read from file.
{quote}
should solve the issue, since if the kernel runs out of memory in that case, it 
should be able to drop the mmapped pages immediately, without having to write them back first.
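
A minimal sketch of what option 1 could look like with plain {{java.nio}}, to make the idea concrete. This is not Flink code: the class and method names ({{WriteThenMapSketch}}, {{writeDirectly}}, {{mapForReading}}) are hypothetical, and the call to {{force(false)}} is an assumption about how one might make sure the pages are clean before they are mapped.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; none of these names exist in Flink.
public class WriteThenMapSketch {

    /** Writes the payload with regular file I/O instead of through a mapped region. */
    public static void writeDirectly(Path file, byte[] payload) throws IOException {
        try (FileChannel channel = FileChannel.open(
                file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(payload);
            // A single write() call may be partial, so loop until the buffer is drained.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // Assumption: ask the OS to write the file content to storage now,
            // so that no write-back is pending when the file is mapped later.
            channel.force(false);
        }
    }

    /** Maps the finished file read-only. The mapping stays valid after the channel is closed. */
    public static MappedByteBuffer mapForReading(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("blocking-partition-sketch", ".data");
        file.toFile().deleteOnExit();

        writeDirectly(file, "hello".getBytes(StandardCharsets.UTF_8));

        MappedByteBuffer mapped = mapForReading(file);
        byte[] readBack = new byte[mapped.remaining()];
        mapped.get(readBack);
        System.out.println(new String(readBack, StandardCharsets.UTF_8));
    }
}
```

Because the mapping is read-only and is created only after the data has been written, the mapped pages are never dirtied through the mapping, which is the property the argument above relies on.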

> Make blocking result partitions consumable multiple times
> ---------------------------------------------------------
>
>                 Key: FLINK-12070
>                 URL: https://issues.apache.org/jira/browse/FLINK-12070
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.9.0
>            Reporter: Till Rohrmann
>            Assignee: Stephan Ewen
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.9.0
>
>         Attachments: image-2019-04-18-17-38-24-949.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to avoid writing produced results multiple times for multiple 
> consumers and in order to speed up batch recoveries, we should make 
> blocking result partitions consumable multiple times. At the moment a 
> blocking result partition is released once its consumers have processed 
> all data. Instead, the result partition should be released once the next 
> blocking result has been produced and all consumers of the blocking result 
> partition have terminated. Moreover, blocking results should not hold on to slot 
> resources like network buffers or memory, as is currently the case with 
> {{SpillableSubpartitions}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
