Hi,

We are running Flink 1.3.1 in production, with one job manager and 3 task managers in standalone mode. The cluster runs in Docker containers. We have 300 slots, and 20 jobs are running with a parallelism of 10 each; the number of jobs may change over time. Recently we have noticed memory-related problems: task manager memory usage keeps increasing, and it does not decrease after a job is cancelled. To investigate, we took a JVM heap snapshot of a task manager. According to the heap analysis, the possible memory leak is related to Flink's ListStateDescriptor, but we are not sure that this is the cause of our memory problem. How can we solve this?
