[ 
https://issues.apache.org/jira/browse/FLINK-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403533#comment-15403533
 ] 

ramkrishna.s.vasudevan commented on FLINK-4094:
-----------------------------------------------

bq. So, another option to fix this would be to set the MaxDirectMemorySize 
parameter properly.
Yes, I agree. But when the job runs in a multi-tenant system alongside other 
processes that are also memory intensive, configuring this may not always be 
easy. It is a direct way to solve the problem if one really knows the job's 
memory needs and requirements.
Regarding pooling, some techniques that can be followed (speaking from what we 
have used in our projects):
-> Just pool the off-heap byte buffers (all are fixed-size buffers). Once the 
usage is over, put them back into the pool. If the pool is empty we need to 
wait (a blocking call, which may not be acceptable). The alternatives are to 
create on-heap buffers, which may not be right in this use case (though it is 
ideally safe), or to allocate off-heap buffers dynamically and warn the user 
that the pool size has to be increased because dynamic off-heap buffers are 
being allocated frequently.
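A minimal Java sketch of this first option, taking the dynamic-allocation 
fallback rather than the blocking call. The class name, pool sizes, and warning 
message are all illustrative assumptions, not Flink code:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical pool of fixed-size direct buffers. When the pool runs dry,
// it allocates an extra buffer and warns, instead of blocking the caller.
public class DirectBufferPool {
    private final ArrayBlockingQueue<ByteBuffer> pool;
    private final int segmentSize;

    public DirectBufferPool(int poolCapacity, int segmentSize) {
        this.segmentSize = segmentSize;
        this.pool = new ArrayBlockingQueue<>(poolCapacity);
        for (int i = 0; i < poolCapacity; i++) {
            pool.offer(ByteBuffer.allocateDirect(segmentSize));
        }
    }

    /** Take a pooled buffer; if the pool is empty, allocate a fresh one and warn. */
    public ByteBuffer request() {
        ByteBuffer buf = pool.poll();
        if (buf == null) {
            System.err.println(
                "Pool exhausted: allocating extra direct buffer; consider a larger pool.");
            buf = ByteBuffer.allocateDirect(segmentSize);
        }
        buf.clear();
        return buf;
    }

    /** Return a buffer; if the pool is already full (an overflow allocation), drop it for GC. */
    public void release(ByteBuffer buf) {
        pool.offer(buf);
    }

    public int available() {
        return pool.size();
    }

    public static void main(String[] args) {
        DirectBufferPool p = new DirectBufferPool(2, 32 * 1024);
        ByteBuffer a = p.request();
        ByteBuffer b = p.request();
        ByteBuffer c = p.request(); // overflow: dynamically allocated, with a warning
        p.release(a);
        p.release(b);
        p.release(c); // pool is full again, so c is dropped and left to the GC
        System.out.println("available=" + p.available()); // prints "available=2"
    }
}
```

Note the trade-off made explicit here: overflow buffers are only freed when the 
garbage collector runs, which is exactly the behavior the warning asks the user 
to tune away.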
-> Another way to avoid segmentation could be chunking. I can see that by 
default we create 32K-sized buffers (the page size). Instead we could create, 
say, 2MB-sized off-heap buffers and hand out 32K-sized slices at increasing 
offsets on every request. All the 2MB buffers would again be pooled, but once 
a buffer is taken from the pool we carve 32K slices out of it. Once a buffer 
is full, or the next request cannot be contained in it, we move on to the next 
buffer. The chunks themselves can also be pooled, so that once a chunk is no 
longer in use we put it back into a chunk pool and reuse it. But this needs 
some knowledge of when the task has exactly completed its usage of that chunk; 
there should not be any remaining references to it.
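The chunking idea can be sketched as follows. This is an illustrative design, 
not Flink code: a simple per-chunk in-use counter stands in for the "knowledge 
of when the task has completed the usage of that chunk", and all names 
(ChunkedAllocator, Chunk, Page) are assumptions:

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical chunking allocator: 2MB pooled chunks, carved into 32K pages.
// A chunk is recycled into the pool only when all of its pages are released.
public class ChunkedAllocator {
    static final int CHUNK_SIZE = 2 * 1024 * 1024;
    static final int PAGE_SIZE = 32 * 1024;

    /** One pooled 2MB chunk plus the count of pages still in use from it. */
    static final class Chunk {
        final ByteBuffer memory = ByteBuffer.allocateDirect(CHUNK_SIZE);
        int nextOffset; // where the next 32K page starts
        int inUse;      // pages handed out and not yet released
    }

    /** A 32K page carved out of some chunk; remembers its owner for release. */
    static final class Page {
        final ByteBuffer view;
        final Chunk owner;
        Page(ByteBuffer view, Chunk owner) { this.view = view; this.owner = owner; }
    }

    private final ArrayDeque<Chunk> pool = new ArrayDeque<>();
    private Chunk current;

    public Page allocate() {
        if (current == null) {
            Chunk recycled = pool.poll();
            current = (recycled != null) ? recycled : new Chunk();
        }
        Chunk c = current;
        ByteBuffer view = c.memory.duplicate();
        view.position(c.nextOffset).limit(c.nextOffset + PAGE_SIZE);
        c.nextOffset += PAGE_SIZE;
        c.inUse++;
        if (c.nextOffset + PAGE_SIZE > CHUNK_SIZE) {
            current = null; // chunk fully carved; the next request moves on
        }
        return new Page(view.slice(), c);
    }

    /** Release a page; once every page of a finished chunk is back, recycle it. */
    public void release(Page p) {
        Chunk c = p.owner;
        if (--c.inUse == 0 && c != current) {
            c.nextOffset = 0;
            pool.offer(c); // all pages returned: the whole chunk can be reused
        }
    }
}
```

The counter makes the recycling condition explicit: a chunk goes back to the 
pool only when it is fully carved and no page view of it is outstanding, which 
is the "no references" requirement from the description above.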

> Off heap memory deallocation might not properly work
> ----------------------------------------------------
>
>                 Key: FLINK-4094
>                 URL: https://issues.apache.org/jira/browse/FLINK-4094
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 1.1.0
>            Reporter: Till Rohrmann
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 1.1.0
>
>
> A user reported that off-heap memory is not properly deallocated when setting 
> {{taskmanager.memory.preallocate: false}} (the default) [1]. This can cause 
> the TaskManager process to be killed by the OS.
> It should be possible to execute multiple batch jobs with preallocation 
> turned off. Direct memory buffers that are no longer used should be properly 
> garbage collected so that the JVM process does not exceed its maximum memory 
> bounds.
> [1] 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/offheap-memory-allocation-and-memory-leak-bug-td12154.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)