Hello Team, good morning!

We have been running a Flink job that consumes from Kafka, and it restarts every 1-2 hours with an OutOfMemoryError. We tried increasing the task manager memory, reducing parallelism, and applying a rate limit to lower the consumption rate, but it still restarts every 1-2 hours regardless. The job runs fine when the payload size is small and fails when payloads are around 700-900 KB. I tried enabling a heap dump on OOM to check for leaks, but I do not see any dump files being generated. Can someone help here?

This is the Flink command I use to enable the heap dump (no dump file is produced):

flink run-application -t yarn-application \
  -Drest.flamegraph.enabled=true \
  -Denv.java.opts.all="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/logs/flink/kafka-to-pubsub-test.hprof" \
  -Dyarn.application.name=idx-pfm-user-financial-data-kafka-to-pubsub-prd \
  -Dtaskmanager.memory.process.size=4g \
  -Dtaskmanager.numberOfTaskSlots=1 \
  -Djobmanager.memory.process.size=4g \
  -c KafkaToPubSubJob test.jar
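Since the error is "unable to create new native thread" rather than Java heap exhaustion, I also want to rule out OS-level thread limits on the task manager hosts. This is the sketch I plan to run there (TM_PID is a placeholder; I would substitute the actual task manager container PID, and the commands assume Linux):

```shell
# Max processes/threads allowed for the current user
# (on Linux, `ulimit -u` counts threads, not just processes)
ulimit -u

# System-wide thread cap
cat /proc/sys/kernel/threads-max

# Live thread count of a given JVM.
# TM_PID is a placeholder: using this shell's own PID for illustration;
# I would replace it with the task manager container's PID.
TM_PID=$$
ps -o nlwp= -p "$TM_PID"
```

If the task manager's thread count is close to either limit when the job dies, that would explain why no .hprof file appears: as far as I understand, -XX:+HeapDumpOnOutOfMemoryError only fires for heap/metaspace exhaustion, not for native thread exhaustion.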
The stack trace:

Caused by: java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method) ~[?:1.8.0_412]
    at java.lang.Thread.start(Thread.java:719) ~[?:1.8.0_412]
    at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[test.jar:?]

Regards,
Madan