Hi Till,

Thanks for the response! I got the metrics from cAdvisor and visualized them via the dashboard shipped with Kubernetes. I have actually been running the Flink job for the past two weeks, and the memory usage has stabilized; there has been no issue so far. I still could not figure out why it was trending up initially.
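For reference, this is roughly how I am reading the pod memory (a sketch; the pod name and namespace are placeholders, and I would need to double-check exactly which cAdvisor metric the dashboard plots, most likely the working set):

    # working-set memory as reported by metrics-server (sourced from kubelet/cAdvisor)
    kubectl top pod <jm-pod-name> -n <namespace>
    # underlying cAdvisor metrics, if scraped directly:
    #   container_memory_working_set_bytes, container_memory_usage_bytes, container_memory_rss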
Thanks a lot for the help!
Eleanore

On Fri, Nov 13, 2020 at 7:01 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Eleanore,
>
> sorry for my late reply. The heap dump you have sent does not look
> problematic. How do you measure the pod memory usage exactly? If you start
> the Flink process with -Xms5120m -Xmx5120m then Flink should allocate 5120
> MB of heap memory. Hence, this should be exactly what you are seeing in
> your memory usage graph. This should actually happen independently of
> checkpointing.
>
> Maybe you can also share the debug logs with us. Maybe they contain some
> more information.
>
> Cheers,
> Till
>
> On Sat, Oct 24, 2020 at 12:24 AM Eleanore Jin <eleanore....@gmail.com> wrote:
>
>> I also tried enabling native memory tracking; via jcmd, here is the
>> memory breakdown: https://ibb.co/ssrZB4F
>>
>> Since the job manager memory configuration for Flink 1.10.2 only has
>> jobmanager.heap.size, and it only translates to heap settings, should I
>> also set -XX:MaxDirectMemorySize and -XX:MaxMetaspaceSize for the job
>> manager? Any recommendations?
>>
>> Thanks a lot!
>> Eleanore
>>
>> On Fri, Oct 23, 2020 at 9:28 AM Eleanore Jin <eleanore....@gmail.com> wrote:
>>
>>> Hi Till,
>>>
>>> please see the screenshot of the heap dump: https://ibb.co/92Hzrpr
>>>
>>> Thanks!
>>> Eleanore
>>>
>>> On Fri, Oct 23, 2020 at 9:25 AM Eleanore Jin <eleanore....@gmail.com> wrote:
>>>
>>>> Hi Till,
>>>> Thanks a lot for the prompt response, please see the information below.
>>>>
>>>> 1. How much memory is assigned to the JM pod?
>>>> 6g for the container memory limit, 5g for jobmanager.heap.size. I think
>>>> this is the only available JM memory configuration for Flink 1.10.2.
>>>>
>>>> 2. Have you tried with newer Flink versions?
>>>> I am actually using Apache Beam, so the latest Flink version they
>>>> support is 1.10.
>>>>
>>>> 3. What state backend is used?
>>>> FsStateBackend, and the checkpoint size is around 12 MB according to
>>>> the checkpoint metrics, so I think the state does not get inlined.
>>>>
>>>> 4. What is state.checkpoints.num-retained?
>>>> I did not configure this explicitly, so by default only 1 should be
>>>> retained.
>>>>
>>>> 5. Anything suspicious in the JM log?
>>>> There is no Exception or Error; the only thing I see is the log line
>>>> below, which keeps repeating:
>>>>
>>>> {"@timestamp":"2020-10-23T16:05:20.350Z","@version":"1","message":"Disabling
>>>> threads for Delete operation as thread count 0 is <=
>>>> 1","logger_name":"org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azure.AzureFileSystemThreadPoolExecutor","thread_name":"jobmanager-future-thread-4","level":"WARN","level_value":30000}
>>>>
>>>> 6. JVM args obtained via jcmd:
>>>>
>>>> -Xms5120m -Xmx5120m -XX:MaxGCPauseMillis=20
>>>> -XX:-OmitStackTraceInFastThrow
>>>>
>>>> 7. Heap info returned by jcmd <pid> GC.heap_info
>>>>
>>>> It suggests only about 1 GB of the heap is used:
>>>>
>>>> garbage-first heap   total 5242880K, used 1123073K
>>>> [0x00000006c0000000, 0x0000000800000000)
>>>> region size 2048K, 117 young (239616K), 15 survivors (30720K)
>>>> Metaspace    used 108072K, capacity 110544K, committed 110720K,
>>>> reserved 1146880K
>>>> class space  used 12963K, capacity 13875K, committed 13952K,
>>>> reserved 1048576K
>>>>
>>>> 8. top -p <pid>
>>>>
>>>> It suggests that the Flink job manager java process consumes 4.8 GB of
>>>> physical memory:
>>>>
>>>> PID  USER  PR  NI  VIRT     RES     SHR    S  %CPU  %MEM  TIME+     COMMAND
>>>> 1    root  20  0   13.356g  4.802g  22676  S  6.0   7.6   37:48.62  java
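>>>>
>>>> For completeness, items 6-8 map to roughly the following commands (a
>>>> sketch; <pid> is the JM java process id):
>>>>
>>>>     jcmd <pid> VM.command_line   # or VM.flags -- JVM args (item 6)
>>>>     jcmd <pid> GC.heap_info      # heap usage (item 7)
>>>>     top -p <pid>                 # resident memory of the process (item 8)
>>>>
>>>> (jcmd <pid> VM.native_memory summary would additionally give a native
>>>> memory breakdown, but that requires the JVM to be started with
>>>> -XX:NativeMemoryTracking=summary.)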
>>>>
>>>> Thanks a lot!
>>>> Eleanore
>>>>
>>>> On Fri, Oct 23, 2020 at 4:19 AM Till Rohrmann <trohrm...@apache.org> wrote:
>>>>
>>>>> Hi Eleanore,
>>>>>
>>>>> how much memory did you assign to the JM pod? Maybe the limit is so
>>>>> high that it takes a bit of time until GC is triggered. Have you tried
>>>>> whether the same problem also occurs with newer Flink versions?
>>>>>
>>>>> The difference between checkpoints enabled and disabled is that the JM
>>>>> needs to do a bit more bookkeeping in order to track the completed
>>>>> checkpoints. If you are using the HeapStateBackend, then all states
>>>>> smaller than state.backend.fs.memory-threshold will get inlined,
>>>>> meaning that they are sent to the JM and stored in the checkpoint meta
>>>>> file. This can increase the memory usage of the JM process. Depending
>>>>> on state.checkpoints.num-retained this can grow as large as the number
>>>>> of retained checkpoints times the checkpoint size. However, I doubt
>>>>> that this adds up to several GB of additional space.
>>>>>
>>>>> In order to better understand the problem, the debug logs of your JM
>>>>> could be helpful. Also a heap dump might be able to point us towards
>>>>> the component which is eating up so much memory.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Thu, Oct 22, 2020 at 4:56 AM Eleanore Jin <eleanore....@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have a Flink job running version 1.10.2; it simply reads from a
>>>>>> Kafka topic with 96 partitions and outputs to another Kafka topic.
>>>>>>
>>>>>> It is running in k8s with 1 JM (not in HA mode) and 12 task managers,
>>>>>> each with 4 slots. Checkpointing persists the snapshots to Azure Blob
>>>>>> Storage, with a checkpoint interval of 3 seconds, a 10-second
>>>>>> timeout, and a minimum pause of 1 second.
>>>>>>
>>>>>> I observed that the job manager pod memory usage grows over time; any
>>>>>> hints on why this is the case? Also, the JM memory usage is
>>>>>> significantly higher compared to running with checkpointing disabled.
>>>>>> [image: image.png]
>>>>>>
>>>>>> Thanks a lot!
>>>>>> Eleanore
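>>>>>>
>>>>>> P.S. If it helps, the checkpoint statistics (count, size, duration)
>>>>>> for the job can also be pulled from the JM REST API (a sketch; the
>>>>>> host and job id are placeholders, 8081 is the default REST port):
>>>>>>
>>>>>>     curl http://<jm-host>:8081/jobs
>>>>>>     curl http://<jm-host>:8081/jobs/<job-id>/checkpoints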