That would be great. It would be best if you posted the link directly in the
JIRA issue.

Cheers,
Till
On Mon, Jun 20, 2016 at 12:55 PM, CPC <acha...@gmail.com> wrote:

> Hi Till,
>
> I saw the JIRA issue. Do you want me to upload the input dataset as well?
> If you want, I can prepare a GitHub repo if that would be easier.
>
> On Jun 20, 2016 1:10 PM, "Till Rohrmann" <trohrm...@apache.org> wrote:
>
> > Hi,
> >
> > your observation sounds like a bug to me and we have to investigate it
> > further. I assume that you're running a batch job, right? Could you
> > maybe share your complete configuration and the job to reproduce the
> > problem with us?
> >
> > I think your finding that direct buffers are not properly freed and
> > garbage collected may be right. I will open a JIRA issue to further
> > investigate and solve the problem. Thanks for reporting :-)
> >
> > At the moment, one way to work around this problem is, as you've already
> > stated, to set taskmanager.memory.preallocate: true in your
> > configuration. For batch jobs, this should actually improve the runtime
> > performance at the cost of a slightly longer start-up time for your
> > TaskManagers.
> >
> > Cheers,
> > Till
> >
> > On Sun, Jun 19, 2016 at 6:16 PM, CPC <acha...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I think I found some information regarding this behavior. In the JVM
> > > it is almost impossible to free memory that was allocated via
> > > ByteBuffer.allocateDirect: there is no explicit way to tell the JVM
> > > "free this direct ByteBuffer". In some forums it is suggested that the
> > > memory can be freed with the method below:
> > >
> > >> import java.nio.ByteBuffer
> > >>
> > >> def releaseBuffers(buffers: List[ByteBuffer]): List[ByteBuffer] = {
> > >>   if (buffers.nonEmpty) {
> > >>     val cleanerMethod = buffers.head.getClass.getMethod("cleaner")
> > >>     cleanerMethod.setAccessible(true)
> > >>     buffers.foreach { buffer =>
> > >>       val cleaner = cleanerMethod.invoke(buffer)
> > >>       val cleanMethod = cleaner.getClass.getMethod("clean")
> > >>       cleanMethod.setAccessible(true)
> > >>       cleanMethod.invoke(cleaner)
> > >>     }
> > >>   }
> > >>   List.empty[ByteBuffer]
> > >> }
> > >
> > > But since the cleaner method is an internal API, this approach is not
> > > recommended, does not work on every JVM, and is not supported in
> > > Java 9 either. I also ran some tests with the method above and the
> > > behavior is not predictable. If the memory was allocated by some other
> > > thread and that thread has exited, then the memory is released. In
> > > practice the GC controls direct memory buffers: if there is no GC
> > > activity and memory is allocated and then dereferenced by different
> > > threads, memory usage goes beyond the intended limit, the machine
> > > starts swapping, and the OS kills the TaskManager. In my tests I saw
> > > the following behavior:
> > >
> > > Suppose thread A allocates 8 GB, exits, and there is no reference to
> > > the allocated memory; then thread B allocates 8 GB, exits, and there is
> > > no reference to the allocated memory.
> > >
> > > When I look at direct memory usage from jvisualvm it looks like the
> > > image below (-Xmx512m -XX:MaxDirectMemorySize=12G):
> > >
> > > [image: Inline images 1]
> > >
> > > But the RSS of the process is 16 GB. If I call System.gc() at that
> > > point, RSS drops to 8 GB, but not to the expected level.
> > >
> > > This is why the Apache Cassandra folks chose sun.misc.Unsafe (
> > > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Off-heap-caching-through-ByteBuffer-allocateDirect-when-JNA-not-available-td6977711.html
> > > ).
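> > >
> > > Roughly, the Unsafe-based approach looks like the sketch below. This
> > > is only an illustration (not Flink's or Cassandra's actual code), but
> > > allocateMemory/freeMemory let you release native memory
> > > deterministically, without depending on a GC cycle or a Cleaner:
> > >
> > >> import sun.misc.Unsafe
> > >>
> > >> // Obtain the Unsafe instance via reflection; the "theUnsafe" field
> > >> // is not publicly accessible.
> > >> val unsafeField = classOf[Unsafe].getDeclaredField("theUnsafe")
> > >> unsafeField.setAccessible(true)
> > >> val unsafe = unsafeField.get(null).asInstanceOf[Unsafe]
> > >>
> > >> // Allocate 8 GB of native memory outside the GC-managed heap and
> > >> // touch it so the pages are actually committed (visible in RSS).
> > >> val size = 8L * 1024 * 1024 * 1024
> > >> val address = unsafe.allocateMemory(size)
> > >> unsafe.setMemory(address, size, 0.toByte)
> > >>
> > >> // Release it explicitly, without waiting for any GC activity.
> > >> unsafe.freeMemory(address)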
> > >
> > > I think currently the only way to limit memory usage in Flink, if you
> > > want to reuse the same TaskManager across jobs, is via
> > > "taskmanager.memory.preallocate: true". Since the memory is allocated
> > > at start-up and never freed, the memory usage stays constant.
> > >
> > > PS: Sorry for my English, I am not a native speaker. I hope I managed
> > > to explain what I intended to :)
> > >
> > > On 18 June 2016 at 16:36, CPC <acha...@gmail.com> wrote:
> > >
> > >> Hello,
> > >>
> > >> I repeated the same test with the following conf values:
> > >>
> > >>> taskmanager.heap.mb: 6500
> > >>>
> > >>> taskmanager.memory.off-heap: true
> > >>>
> > >>> taskmanager.memory.fraction: 0.9
> > >>
> > >> I set TM_MAX_OFFHEAP_SIZE="6G" in taskmanager.sh. The TaskManager
> > >> started with:
> > >>
> > >>> capacman 14543 323 56.0 17014744 13731328 pts/1 Sl 16:23 35:25
> > >>> /home/capacman/programlama/java/jdk1.7.0_75/bin/java
> > >>> -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -Xms650M -Xmx650M
> > >>> -XX:MaxDirectMemorySize=6G -XX:MaxPermSize=256m
> > >>> -Dlog.file=/home/capacman/Data/programlama/flink-1.0.3/log/flink-capacman-taskmanager-0-capacman-Aspire-V3-771.log
> > >>> -Dlog4j.configuration=file:/home/capacman/Data/programlama/flink-1.0.3/conf/log4j.properties
> > >>> -Dlogback.configurationFile=file:/home/capacman/Data/programlama/flink-1.0.3/conf/logback.xml
> > >>> -classpath
> > >>> /home/capacman/Data/programlama/flink-1.0.3/lib/flink-dist_2.11-1.0.3.jar:/home/capacman/Data/programlama/flink-1.0.3/lib/flink-python_2.11-1.0.3.jar:/home/capacman/Data/programlama/flink-1.0.3/lib/log4j-1.2.17.jar:/home/capacman/Data/programlama/flink-1.0.3/lib/slf4j-log4j12-1.7.7.jar:::
> > >>> org.apache.flink.runtime.taskmanager.TaskManager --configDir
> > >>> /home/capacman/Data/programlama/flink-1.0.3/conf
> > >>
> > >> but memory usage reached up to 13 GB. Could somebody explain to me why
> > >> the memory usage is so high? I expected it to be at most 8 GB with some
> > >> JVM internal overhead.
> > >>
> > >> [image: Inline images 1]
> > >>
> > >> [image: Inline images 2]
> > >>
> > >> On 17 June 2016 at 20:26, CPC <acha...@gmail.com> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I am doing some tests on off-heap memory usage and I am seeing some
> > >>> odd behavior. My TaskManager heap limit is 12288 MB, and when I set
> > >>> "taskmanager.memory.off-heap: true", every job allocates at most
> > >>> 11673 MB of off-heap memory, which is heapsize * 0.95 (the value of
> > >>> taskmanager.memory.fraction). But when I submit a second job it
> > >>> allocates another 11 GB and does not free the memory, since
> > >>> MaxDirectMemorySize is set via
> > >>> -XX:MaxDirectMemorySize=${TM_MAX_OFFHEAP_SIZE} with
> > >>> TM_MAX_OFFHEAP_SIZE="8388607T"; my laptop goes into swap and the
> > >>> kernel OOM killer kills the TaskManager. If I trigger a GC from
> > >>> VisualVM between jobs, the direct memory is released, but the memory
> > >>> usage of the TaskManager reported by ps is still around 20 GB (RSS)
> > >>> and 27 GB (virtual size). In that case I could submit my test job a
> > >>> few times without the TaskManager being OOM-killed, but after about
> > >>> 10 submissions it was killed again. I don't understand why the JVM
> > >>> memory usage is still high even though all the direct memory was
> > >>> released. Do you have any idea?
> > >>> Then I set MaxDirectMemorySize to 12 GB. In this case the direct
> > >>> memory was freed without any explicit GC triggering from VisualVM,
> > >>> but the JVM process memory usage was still high, around 20 GB (RSS)
> > >>> and 27 GB (virtual size). After maybe another 10 submissions it
> > >>> killed the TaskManager again. I think this is a bug, and it makes it
> > >>> impossible to reuse TaskManagers without restarting them in
> > >>> standalone mode.
> > >>>
> > >>> [image: Inline images 1]
> > >>>
> > >>> [image: Inline images 2]
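> > >>>
> > >>> The pattern of the test is roughly the sketch below (sizes are
> > >>> smaller here so it runs on a laptop; this is only an illustration of
> > >>> the allocate-then-dereference pattern, not the exact test code). Run
> > >>> it with -Xmx512m -XX:MaxDirectMemorySize=12G as above and watch RSS
> > >>> with ps or jvisualvm:
> > >>>
> > >>>> import java.nio.ByteBuffer
> > >>>>
> > >>>> object DirectMemorySketch {
> > >>>>   // Allocate direct buffers on a fresh thread and let all
> > >>>>   // references die when the thread exits.
> > >>>>   def allocateAndDrop(totalMb: Int): Unit = {
> > >>>>     val t = new Thread(new Runnable {
> > >>>>       override def run(): Unit = {
> > >>>>         // 1 MB buffers; references live only inside this thread.
> > >>>>         var buffers =
> > >>>>           List.fill(totalMb)(ByteBuffer.allocateDirect(1024 * 1024))
> > >>>>         buffers = Nil // drop the references before the thread exits
> > >>>>       }
> > >>>>     })
> > >>>>     t.start()
> > >>>>     t.join()
> > >>>>   }
> > >>>>
> > >>>>   def main(args: Array[String]): Unit = {
> > >>>>     allocateAndDrop(512) // "thread A"
> > >>>>     allocateAndDrop(512) // "thread B"
> > >>>>     // Without an explicit System.gc() here, the native memory of
> > >>>>     // both rounds stays reserved: the small heap sees no pressure,
> > >>>>     // so no GC runs and no Cleaner is ever invoked.
> > >>>>     Thread.sleep(60000)
> > >>>>   }
> > >>>> }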