[ 
https://issues.apache.org/jira/browse/FLINK-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483225#comment-15483225
 ] 

ramkrishna.s.vasudevan commented on FLINK-3322:
-----------------------------------------------

[~ggevay]
I agree with you. But in my tests I found that most of the allocation was 
coming from the different iterative tasks allocating pages in the Sorters. 
Yes, the Join drivers are also calling allocatePages every time. I am not 
sure we can make everything resettable, but we could instead let a task 
inform the driver that it is an iterative task; in that case the driver 
would cache the memory pages it creates for the different join strategies, 
and the close call would retain those memory segments at the driver level. 
We would then need to add a constructor to all the Iterators that accepts a 
list of MemorySegment rather than only the MemoryManager. 
I have in any case completed the code that makes the Sorters reuse the 
memory they created rather than allocating it every time. Would you like to 
see that PR? I could build on top of it and have the iterators reuse the 
memory segments as well.
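
To make the idea concrete, here is a minimal sketch of such a constructor 
overload. It uses stand-in types so it compiles on its own; none of the 
names below (CachingJoinIterator, ownsPages, the nested MemorySegment and 
MemoryManager stand-ins) are actual Flink classes:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Sketch only: stand-ins for Flink's MemorySegment and MemoryManager so the
// example is self-contained. None of these names are existing Flink API.
public class CachingJoinIterator {

    /** Stand-in for org.apache.flink.core.memory.MemorySegment. */
    static final class MemorySegment {
        final byte[] backing;
        MemorySegment(int size) { this.backing = new byte[size]; }
    }

    /** Stand-in for the page-allocation part of the MemoryManager. */
    interface MemoryManager {
        List<MemorySegment> allocatePages(Object owner, int numPages);
        void release(List<MemorySegment> segments);
    }

    private final List<MemorySegment> pages;
    private final boolean ownsPages; // true if we allocated the pages ourselves

    /** Current style: allocate fresh pages from the MemoryManager each time. */
    CachingJoinIterator(MemoryManager manager, Object owner, int numPages) {
        this.pages = manager.allocatePages(owner, numPages);
        this.ownsPages = true;
    }

    /** Proposed overload: reuse segments the driver cached across supersteps. */
    CachingJoinIterator(List<MemorySegment> cachedPages) {
        this.pages = cachedPages;
        this.ownsPages = false;
    }

    /** On close, hand segments back to the driver instead of releasing them. */
    List<MemorySegment> close(MemoryManager manager) {
        if (ownsPages) {
            manager.release(pages);
            return new ArrayList<>();
        }
        return pages; // the driver retains these for the next superstep
    }
}
{code}

The driver would hold on to the list returned by close() and pass it into 
the second constructor on the next superstep, so allocatePages is hit only 
once per iterative task instead of once per superstep.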



> MemoryManager creates too much GC pressure with iterative jobs
> --------------------------------------------------------------
>
>                 Key: FLINK-3322
>                 URL: https://issues.apache.org/jira/browse/FLINK-3322
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 1.0.0
>            Reporter: Gabor Gevay
>            Priority: Critical
>             Fix For: 1.0.0
>
>         Attachments: FLINK-3322.docx
>
>
> When taskmanager.memory.preallocate is false (the default), released memory 
> segments are not added to a pool, but the GC is expected to take care of 
> them. This puts too much pressure on the GC with iterative jobs, where the 
> operators reallocate all memory at every superstep.
> See the following discussion on the mailing list:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Memory-manager-behavior-in-iterative-jobs-tt10066.html
> Reproducing the issue:
> https://github.com/ggevay/flink/tree/MemoryManager-crazy-gc
> The class to start is malom.Solver. If you increase the memory given to the 
> JVM from 1 GB to 50 GB, performance gradually degrades more than tenfold. 
> (On the first run, it will spend a few minutes generating lookup tables in 
> /tmp.) 
> (I think the slowdown might also depend somewhat on 
> taskmanager.memory.fraction, because more unused non-managed memory results 
> in rarer GCs.)
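
For context, a minimal sketch of what "adding released segments to a pool" 
means here; this is a toy illustration, not the MemoryManager's actual 
internals:

{code:java}
import java.util.ArrayDeque;

// Toy pool: released segments are recycled instead of being dropped for the
// GC to collect. All names here are illustrative, not Flink internals.
public class SegmentPool {
    private final ArrayDeque<byte[]> freeSegments = new ArrayDeque<>();
    private final int segmentSize;

    public SegmentPool(int segmentSize) {
        this.segmentSize = segmentSize;
    }

    /** Reuse a released segment if one is available; allocate only when empty. */
    public byte[] acquire() {
        byte[] segment = freeSegments.poll();
        return segment != null ? segment : new byte[segmentSize];
    }

    /** Return a segment to the pool instead of letting the GC reclaim it. */
    public void release(byte[] segment) {
        freeSegments.push(segment);
    }
}
{code}

With preallocate=false the release step effectively drops the segment, so 
every superstep re-runs the acquire path with an empty pool and the GC has 
to collect all the dead segments from the previous superstep.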


