Rohit Singh created FLINK-9080:
----------------------------------

             Summary: Flink Scheduler goes OOM, suspecting a memory leak
                 Key: FLINK-9080
                 URL: https://issues.apache.org/jira/browse/FLINK-9080
             Project: Flink
          Issue Type: Bug
          Components: JobManager
    Affects Versions: 1.4.0
            Reporter: Rohit Singh
         Attachments: Classloaded vs unloaded.png, Top Level packages.JPG, Top 
level classes.JPG

Running Flink version 1.4.0 on Mesos. The scheduler runs along with the job 
manager in a single container, whereas the task managers run in separate 
containers.

A couple of jobs were running continuously, and the Flink scheduler was working 
properly along with the task managers. Due to some change in the data, one of 
the jobs started failing continuously. In the meantime, there was a surge in 
Flink scheduler memory usage, and the scheduler eventually died of OOM.

 

A memory dump analysis was done.

The findings were as follows: !Top Level packages.JPG!!Top level 
classes.JPG!!Classloaded vs unloaded.png!
 * The majority of the top-level packages retaining heap pointed to 
Flinkuserclassloader, glassfish (Jersey library), and Finalizer classes (see the 
top level packages image).
 * The top-level classes belonged to Flinkuserclassloader (see the top level 
classes image).
 * The number of classes unloaded was quite low compared with the number loaded 
(PFA), in spite of adding the JVM options -XX:+UseConcMarkSweepGC 
-XX:+CMSClassUnloadingEnabled.
 * There were also custom classes that were duplicated during subsequent 
class uploads.
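
To watch the loaded versus unloaded class counts over time without taking a 
heap dump, a small sketch like the following could be used. It reads the JVM's 
standard ClassLoadingMXBean. The class name ClassUnloadingCheck and its methods 
are hypothetical and not part of Flink, and the numbers are only meaningful if 
the code runs inside the JobManager JVM (or if the same counters are read from 
that JVM over JMX).

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration only; not part of Flink.
final class ClassUnloadingCheck {

    // Fraction of all classes ever loaded by this JVM that have since been unloaded.
    static double unloadedFraction(long totalLoaded, long unloaded) {
        return totalLoaded == 0 ? 0.0 : (double) unloaded / totalLoaded;
    }

    // One-line summary of the class loading counters of the current JVM.
    static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        int current = bean.getLoadedClassCount();      // classes loaded right now
        long total = bean.getTotalLoadedClassCount();  // classes loaded since JVM start
        long unloaded = bean.getUnloadedClassCount();  // classes unloaded since JVM start
        return String.format(
                "currentlyLoaded=%d totalLoaded=%d unloaded=%d unloadedFraction=%.3f",
                current, total, unloaded, unloadedFraction(total, unloaded));
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples one second apart.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

If currentlyLoaded keeps growing across job restarts while unloaded stays 
nearly flat, that would be consistent with the user class loaders not being 
released.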

PFA all the images from the heap dump. Can you suggest some pointers on how to 
overcome this issue?

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
