Hi Patrick,

We have encountered the same issue: the TaskManager's memory consumption increases almost monotonically.
I'll try to describe what we observed and our solution, so you can check whether it solves your problem.

What we observed:

1. Jobs with the RocksDB state backend failed after a random period of time after deployment. All failures were TaskManager pods OOM-killed by Kubernetes; no JVM exceptions such as "java.lang.OutOfMemoryError: Java heap space" ever appeared.
2. TaskManager pods were still OOM-killed by Kubernetes after setting kubernetes.taskmanager.memory.limit-factor to a value larger than 1.0, such as 1.5 or 2.0. This option controls the ratio between the memory limit and the memory request submitted to Kubernetes.
3. The more memory we requested from Kubernetes, the longer it took before the first OOM-kill event. container_memory_working_set_bytes increased almost monotonically, and pods were OOM-killed when it hit the configured memory limit.
4. We used jeprof to profile RocksDB's native memory allocation and found no memory leak. RocksDB's overall native memory consumption was less than the configured managed memory.
5. container_memory_rss was far larger than the memory requested. We gathered statistics with jcmd and jeprof, and it turned out that RSS was far larger than the memory accounted for by the JVM and jemalloc.

The conclusions we drew from this:

1. The memory issue is not caused by the JVM; if it were, we would see JVM exceptions in the logs.
2. We found reports that Flink cannot control RocksDB's memory consumption precisely, so some use beyond the configured budget is expected. But we believe that is not our case: we requested 8 GB and set the limit to 16 GB, and while some overuse is plausible, it should not be of that magnitude.
3. We suspected a memory leak; there are many reports of memory leaks caused by RocksDB.
4. But since the jeprof result showed no leak, we ruled RocksDB out as the cause.
5. We suspected jemalloc itself, because jeprof showed no memory issue while container_memory_rss said otherwise. In other words, there is a gap between the JVM/allocator statistics mentioned above and what the kernel actually keeps resident.

Searching further, we found that jemalloc does not cope well with Transparent Huge Pages (THP), which was set to "always" by default on our hosts. Briefly, jemalloc requests huge pages (2 MB pages instead of 4 KB pages) from the kernel and later tells the kernel that part of a 2 MB page is unused so that part can be freed. But the kernel does not split the huge page into normal 4 KB pages, so it never frees any of it unless the whole 2 MB page is marked as reclaimable. Many database systems such as Redis, MongoDB and Oracle recommend disabling Transparent Huge Pages, so we disabled it (rough commands below). After disabling THP, we have not seen the memory issue anymore.

Hope our experience helps you.
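P.S. In case it saves you some time, below is a sketch of how such an investigation can be done. The paths, PIDs and sampling settings are placeholders rather than the exact values from our environment.

To profile native allocations, jemalloc needs to be built with profiling enabled; sampling is switched on via MALLOC_CONF before the TaskManager process starts, and the dumped .heap files are then inspected with jeprof. On the JVM side, Native Memory Tracking plus jcmd gives the JVM-internal view:

  # jemalloc heap profiling (requires a jemalloc build with profiling enabled);
  # dump a profile roughly every 2^30 bytes of allocation activity
  export MALLOC_CONF="prof:true,lg_prof_interval:30"

  # inspect a dumped profile: point jeprof at the java binary and the .heap file(s)
  jeprof --show_bytes /path/to/java jeprof.*.heap

  # JVM-side view (JVM started with -XX:NativeMemoryTracking=summary)
  jcmd <taskmanager-pid> VM.native_memory summary

And this is roughly how THP can be checked and turned off on the host (needs root; on our hosts it was "always" by default, and the change has to be made persistent through the boot configuration to survive reboots):

  cat /sys/kernel/mm/transparent_hugepage/enabled
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo never > /sys/kernel/mm/transparent_hugepage/defrag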
On 2023/10/17 13:41:02 "Eifler, Patrick" wrote:
> Hello,
>
> We are running Flink jobs on K8s and using RocksDB as state backend. It is connected to S3 for checkpointing. We have multiple states in the job (mapstate and value states). We are seeing a slow but stable increase over time on the memory consumption. We only see this in our jobs connected to RocksDB.
>
> We are currently using the default memory setting (state-backend-rocksdb-memory-managed=true). Now we are wondering what a good alternative setting would be. We want to try to enable the state.backend.rocksdb.memory.partitioned-index-filters but it only takes effect if the managed memory is turned off, so we need to figure out what would be a good amount for memory.fixed-per-slot.
>
> Any hint what a good indicator for that calculation would be?
> Any other experience if someone has seen similar behavior before would also be much appreciated.
> Thanks!
>
> Best Regards
>
> --
> Patrick Eifler
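P.S. Regarding the fixed-per-slot question: we have not tuned that setup ourselves, but one possible indicator is the managed memory a slot gets today, since that is the budget RocksDB is currently kept within. As a rough sketch only (the deployment target, jar path and the 2gb value are placeholders, not recommendations), the settings could be passed as dynamic properties at submission time, or put into flink-conf.yaml instead:

  ./bin/flink run-application -t kubernetes-application \
    -Dstate.backend.rocksdb.memory.managed=false \
    -Dstate.backend.rocksdb.memory.fixed-per-slot=2gb \
    -Dstate.backend.rocksdb.memory.partitioned-index-filters=true \
    local:///opt/flink/usrlib/my-job.jar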