I also tried enabling native memory tracking and dumping it via jcmd; here is the memory breakdown: https://ibb.co/ssrZB4F
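In case it helps to reproduce, this is roughly how I enabled and queried NMT. It is only a sketch: I am assuming the flag is passed to the JM JVM via env.java.opts.jobmanager, and that PID 1 is the JM java process inside the container (as in the top output further down).

  # JVM flag that must be set at JVM startup, e.g. via env.java.opts.jobmanager (assumption)
  -XX:NativeMemoryTracking=summary

  # inside the JM container, query the breakdown with jcmd
  jcmd 1 VM.native_memory summary
  # optionally take a baseline first and diff against it later
  jcmd 1 VM.native_memory baseline
  jcmd 1 VM.native_memory summary.diff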
Since the job manager memory configuration for Flink 1.10.2 only has jobmanager.heap.size, and that only translates into heap settings, should I also set -XX:MaxDirectMemorySize and -XX:MaxMetaspaceSize for the job manager? Any recommendations? (A rough sketch of what I have in mind is at the bottom of this mail, below the quoted thread.)

Thanks a lot!
Eleanore

On Fri, Oct 23, 2020 at 9:28 AM Eleanore Jin <eleanore....@gmail.com> wrote:

> Hi Till,
>
> Please see the screenshot of the heap dump: https://ibb.co/92Hzrpr
>
> Thanks!
> Eleanore
>
> On Fri, Oct 23, 2020 at 9:25 AM Eleanore Jin <eleanore....@gmail.com> wrote:
>
>> Hi Till,
>> Thanks a lot for the prompt response, please see the information below.
>>
>> 1. How much memory is assigned to the JM pod?
>> 6g for the container memory limit and 5g for jobmanager.heap.size; I think this is the only JM memory configuration available in Flink 1.10.2.
>>
>> 2. Have you tried newer Flink versions?
>> I am actually using Apache Beam, and the latest Flink version it supports is 1.10.
>>
>> 3. Which state backend is used?
>> FsStateBackend, and the checkpoint size is around 12MB according to the checkpoint metrics, so I don't think the state gets inlined.
>>
>> 4. What is state.checkpoints.num-retained?
>> I did not configure this explicitly, so by default only 1 checkpoint should be retained.
>>
>> 5. Anything suspicious in the JM log?
>> There is no Exception or Error; the only thing I see is the log line below, which keeps repeating:
>>
>> {"@timestamp":"2020-10-23T16:05:20.350Z","@version":"1","message":"Disabling threads for Delete operation as thread count 0 is <= 1","logger_name":"org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azure.AzureFileSystemThreadPoolExecutor","thread_name":"jobmanager-future-thread-4","level":"WARN","level_value":30000}
>>
>> 6. JVM args obtained via jcmd:
>>
>> -Xms5120m -Xmx5120m -XX:MaxGCPauseMillis=20 -XX:-OmitStackTraceInFastThrow
>>
>> 7. Heap info returned by jcmd <pid> GC.heap_info
>> It suggests only about 1G of the heap is used:
>>
>> garbage-first heap   total 5242880K, used 1123073K [0x00000006c0000000, 0x0000000800000000)
>>   region size 2048K, 117 young (239616K), 15 survivors (30720K)
>> Metaspace       used 108072K, capacity 110544K, committed 110720K, reserved 1146880K
>>   class space    used 12963K, capacity 13875K, committed 13952K, reserved 1048576K
>>
>> 8. top -p <pid>
>> It suggests the Flink job manager java process consumes 4.8G of physical memory:
>>
>>   PID USER   PR  NI    VIRT     RES    SHR S  %CPU %MEM    TIME+  COMMAND
>>     1 root   20   0  13.356g  4.802g  22676 S   6.0  7.6  37:48.62 java
>>
>> Thanks a lot!
>> Eleanore
>>
>> On Fri, Oct 23, 2020 at 4:19 AM Till Rohrmann <trohrm...@apache.org> wrote:
>>
>>> Hi Eleanore,
>>>
>>> How much memory did you assign to the JM pod? Maybe the limit is so high that it takes a while until GC is triggered. Have you tried whether the same problem also occurs with newer Flink versions?
>>>
>>> The difference between checkpoints enabled and disabled is that the JM needs to do a bit more bookkeeping in order to track the completed checkpoints. If you are using the HeapStateBackend, then all state smaller than state.backend.fs.memory-threshold gets inlined, meaning that it is sent to the JM and stored in the checkpoint meta file. This can increase the memory usage of the JM process. Depending on state.checkpoints.num-retained, this can grow as large as the number of retained checkpoints times the checkpoint size. However, I doubt that this adds up to several GB of additional space.
>>>
>>> In order to better understand the problem, the debug logs of your JM could be helpful. A heap dump might also point us towards the component that is eating up so much memory.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, Oct 22, 2020 at 4:56 AM Eleanore Jin <eleanore....@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have a Flink job running version 1.10.2; it simply reads from a Kafka topic with 96 partitions and writes to another Kafka topic.
>>>>
>>>> It runs in k8s, with 1 JM (not in HA mode) and 12 task managers, each with 4 slots.
>>>> Checkpointing persists the snapshot to Azure Blob Storage, with a 3-second checkpoint interval, a 10-second timeout, and a minimum pause of 1 second.
>>>>
>>>> I observed that the job manager pod memory usage grows over time; any hints on why this is the case? Also, the JM memory usage is significantly higher than with checkpointing disabled.
>>>> [image: image.png]
>>>>
>>>> Thanks a lot!
>>>> Eleanore
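P.S. Regarding my question at the top about -XX:MaxDirectMemorySize and -XX:MaxMetaspaceSize: below is only a sketch of the kind of configuration I have in mind, not something I have tested. Only the 5120m heap matches our current setting; the 1g direct memory and 256m metaspace limits are placeholder guesses on my side.

  # flink-conf.yaml (Flink 1.10.x); values other than the heap size are placeholders
  jobmanager.heap.size: 5120m
  # extra JVM options applied only to the job manager process
  env.java.opts.jobmanager: "-XX:MaxDirectMemorySize=1g -XX:MaxMetaspaceSize=256m"

Does that look reasonable, or would you cap these differently?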