Hi, thanks for your reply. We are storing the state in memory since it is short-lived; we thought that adding RocksDB would add overhead.
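For context, the state currently lives on the JVM heap (the default HashMapStateBackend). A minimal sketch of what switching to RocksDB would look like; the class name, job name, and placeholder pipeline below are only illustrative, not our actual job:

    import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RocksDbBackendSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Keep working state in native memory / local disk instead of the JVM heap;
            // the boolean enables incremental checkpoints.
            env.setStateBackend(new EmbeddedRocksDBStateBackend(true));

            // Placeholder pipeline so the sketch runs; the real job has the stateful
            // transformations described in the quoted mail below.
            env.fromElements(1, 2, 3).print();

            env.execute("rocksdb-backend-sketch");
        }
    }

This needs the flink-statebackend-rocksdb dependency on the classpath, and RocksDB consumes managed memory rather than heap, so the memory configuration would have to be revisited.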
On Thu, May 23, 2024 at 4:38 PM Sachin Mittal <sjmit...@gmail.com> wrote:
> Hi
> Where are you storing the state.
> Try rocksdb.
>
> Thanks
> Sachin
>
>
> On Thu, 23 May 2024 at 6:19 PM, Sigalit Eliazov <e.siga...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to understand the following behavior in our Flink application
>> cluster. Any assistance would be appreciated.
>>
>> We are running a Flink application cluster with 5 task managers, each
>> with the following configuration:
>>
>> - jobManagerMemory: 12g
>> - taskManagerMemory: 20g
>> - taskManagerMemoryHeapSize: 12g
>> - taskManagerMemoryNetworkMax: 4g
>> - taskManagerMemoryNetworkMin: 1g
>> - taskManagerMemoryManagedSize: 50m
>> - taskManagerMemoryOffHeapSize: 2g
>> - taskManagerMemoryNetworkFraction: 0.2
>> - taskManagerNetworkMemorySegmentSize: 4mb
>> - taskManagerMemoryFloatingBuffersPerGate: 64
>> - taskmanager.memory.jvm-overhead.min: 256mb
>> - taskmanager.memory.jvm-overhead.max: 2g
>> - taskmanager.memory.jvm-overhead.fraction: 0.1
>>
>> Our pipeline includes stateful transformations, and we are verifying that
>> we clear the state once it is no longer needed.
>>
>> Through the Flink UI, we observe that the heap size increases and
>> decreases during the job lifecycle.
>>
>> However, there is a noticeable delay between clearing the state and the
>> reduction in heap size usage, which I assume is related to the garbage
>> collector frequency.
>>
>> What is puzzling is the task manager pod memory usage. It appears that
>> the memory usage increases intermittently and is not released. We verified
>> the different state metrics and confirmed they are changing according to
>> the logic.
>>
>> Additionally, if we had a state that was never released, I would expect
>> to see the heap size increasing constantly as well.
>>
>> Any insights or ideas?
>>
>> Thanks,
>>
>> Sigalit
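PS: on the point above about clearing state once it is no longer needed, the explicit clear() calls could also be complemented with state TTL, so the heap backend drops expired entries lazily as well. A rough sketch; the descriptor name, timeout, and cleanup settings are made-up values, not taken from our job:

    import org.apache.flink.api.common.state.StateTtlConfig;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.time.Time;

    public class TtlStateSketch {
        // Hypothetical descriptor; it would be registered in a RichFunction's open() as usual.
        public static ValueStateDescriptor<Long> lastSeenDescriptor() {
            StateTtlConfig ttl = StateTtlConfig.newBuilder(Time.minutes(30))
                    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                    .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                    // On the heap backend, expired entries are removed lazily; incremental
                    // cleanup additionally checks a few entries per state access.
                    .cleanupIncrementally(10, false)
                    .build();

            ValueStateDescriptor<Long> descriptor =
                    new ValueStateDescriptor<>("lastSeen", Long.class);
            descriptor.enableTimeToLive(ttl);
            return descriptor;
        }
    }

Even with cleanup in place, the pod-level growth we see may simply be the JVM holding on to freed heap and off-heap allocations rather than returning them to the OS, which would match the delayed drop we observe in the UI.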