Hi Randal,

Please consider using jemalloc instead of glibc as the default memory allocator [1] to avoid memory fragmentation. As far as I know, at least two groups of users, running Flink on YARN and on Kubernetes respectively, have reported a similar problem of memory growing continuously after each restart [2]. In both cases the problem went away once they switched to jemalloc.
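For reference, a minimal sketch of how jemalloc could be preloaded in a Debian-based Flink image (the base image tag, package name and library path are assumptions and may differ for your distribution):

    FROM flink:1.12.1
    # Install jemalloc and preload it so every process in the container
    # uses it instead of the glibc allocator
    RUN apt-get update && \
        apt-get install -y --no-install-recommends libjemalloc2 && \
        rm -rf /var/lib/apt/lists/*
    ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2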
[1] https://issues.apache.org/jira/browse/FLINK-19125
[2] https://issues.apache.org/jira/browse/FLINK-18712

Best
Yun Tang

________________________________
From: Lasse Nedergaard <lassenedergaardfl...@gmail.com>
Sent: Wednesday, February 3, 2021 14:07
To: Xintong Song <tonysong...@gmail.com>
Cc: user <user@flink.apache.org>
Subject: Re: Memory usage increases on every job restart resulting in eventual OOMKill

Hi

We had something similar, and our problem was class loader leaks. We used a summary-log component to reduce logging, but it turned out that it used a static object that wasn't released when we got an OOM or a restart. Flink was reusing the task managers, so the only workaround was to stop the job, wait until the task managers were removed, and start again, until we fixed the underlying problem.

Med venlig hilsen / Best regards
Lasse Nedergaard

On 3 Feb 2021, at 02:54, Xintong Song <tonysong...@gmail.com> wrote:

How is the memory measured? I mean: which Flink or k8s metric is collected? I'm asking because, depending on which metric is used, the *container memory usage* can be defined differently, e.g. whether mmap memory is included.

Also, could you share the effective memory configuration for the taskmanagers? You should find something like the following at the beginning of the taskmanager logs.

INFO  [] - Final TaskExecutor Memory configuration:
INFO  [] -   Total Process Memory:          1.688gb (1811939328 bytes)
INFO  [] -     Total Flink Memory:          1.250gb (1342177280 bytes)
INFO  [] -       Total JVM Heap Memory:     512.000mb (536870902 bytes)
INFO  [] -         Framework:               128.000mb (134217728 bytes)
INFO  [] -         Task:                    384.000mb (402653174 bytes)
INFO  [] -       Total Off-heap Memory:     768.000mb (805306378 bytes)
INFO  [] -         Managed:                 512.000mb (536870920 bytes)
INFO  [] -         Total JVM Direct Memory: 256.000mb (268435458 bytes)
INFO  [] -           Framework:             128.000mb (134217728 bytes)
INFO  [] -           Task:                  0 bytes
INFO  [] -           Network:               128.000mb (134217730 bytes)
INFO  [] -     JVM Metaspace:               256.000mb (268435456 bytes)
INFO  [] -     JVM Overhead:                192.000mb (201326592 bytes)

Thank you~

Xintong Song

On Tue, Feb 2, 2021 at 8:59 PM Randal Pitt <randal.p...@foresite.com> wrote:

Hi Xintong Song,

Correct, we are using standalone k8s. Task managers are deployed as a statefulset, so they have consistent pod names. We tried using native k8s (in fact I'd prefer to) but got persistent "io.fabric8.kubernetes.client.KubernetesClientException: too old resource version: 242214695 (242413759)" errors, which resulted in jobs being restarted every 30-60 minutes.

We are using Prometheus Node Exporter to capture memory usage. The graph shows the metric:

sum(container_memory_usage_bytes{container_name="taskmanager",pod_name=~"$flink_task_manager"}) by (pod_name)

I've attached the original <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t2869/Screenshot_2021-02-02_at_11.png> so Nabble doesn't shrink it.

Best regards,

Randal.

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
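A note on the metric in the last message: container_memory_usage_bytes counts page cache and mmap-backed file pages, while container_memory_working_set_bytes excludes reclaimable (inactive file) cache and is closer to what OOM accounting acts on. Plotting both can show whether the growth is real resident memory or mostly cache. A sketch of the two queries, assuming the same cAdvisor label names as in the query above:

    # Total usage: includes page cache and mmap-backed file pages
    sum(container_memory_usage_bytes{container_name="taskmanager",pod_name=~"$flink_task_manager"}) by (pod_name)

    # Working set: usage minus reclaimable (inactive file) cache
    sum(container_memory_working_set_bytes{container_name="taskmanager",pod_name=~"$flink_task_manager"}) by (pod_name)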