[ 
https://issues.apache.org/jira/browse/FLINK-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16414093#comment-16414093
 ] 

Rohit Singh commented on FLINK-9080:
------------------------------------

Based on the Flink documentation, I tried adding the job jar to the Flink lib 
directory of the scheduler and the task managers to avoid dynamic class loading.

https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/debugging_classloading.html

I am now getting the following error:
{code:java}
Class=o.a.f.r.e.ExecutionGraph Msg=Source: Custom Source -> Sink: Unnamed (1/1) (3f12f6953a235eb43f07cdf7966b5fcf) switched from RUNNING to FAILED.
org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot instantiate user function.
    at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:235) ~[iot-mirror-device.jar:na]
    at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95) ~[iot-mirror-device.jar:na]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:231) ~[iot-mirror-device.jar:na]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718) ~[iot-mirror-device.jar:na]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_91]
Caused by: java.lang.ClassCastException: cannot assign instance of org.apache.commons.collections.map.LinkedMap to field org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.pendingOffsetsToCommit of type org.apache.commons.collections.map.LinkedMap in instance of org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133) ~[na:1.8.0_91]
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2024) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) ~[na:1.8.0_91]
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) ~[na:1.8.0_91]
    at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:290) ~[iot-mirror-device.jar:na]
{code}
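
The ClassCastException names the same class (LinkedMap) on both sides of the assignment. That usually means the class was loaded twice, by two different classloaders. A minimal sketch of a helper for checking this is shown here. It is not part of Flink: the class name ClassOrigin and the method describe are hypothetical, and it reports only which classloader defined a given class and where it was loaded from.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not a Flink API): reports the defining
// classloader and the code source (jar or directory) of a class.
class ClassOrigin {

    static String describe(Class<?> type) {
        // getClassLoader() returns null for classes defined by the bootstrap loader.
        ClassLoader loader = type.getClassLoader();
        String loaderName = (loader == null) ? "bootstrap" : loader.toString();

        // The code source can be null, e.g. for core JDK classes.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();

        return type.getName() + " loaded by " + loaderName + " from " + location;
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(ClassOrigin.class));
    }
}
```

Logging describe(...) for the LinkedMap class from inside the job and comparing the output on the scheduler and the task managers should show whether the class comes from more than one jar or classloader.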
 

 

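The loaded vs unloaded class counts mentioned in the issue description can also be read from inside the JVM through the standard ClassLoadingMXBean. A minimal sketch follows. The class name ClassLoadingSnapshot is hypothetical, and how often to log the snapshot is left to the caller.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: formats the JVM's class loading counters in one line.
class ClassLoadingSnapshot {

    static String snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return "loaded=" + bean.getLoadedClassCount()          // currently loaded
                + " totalLoaded=" + bean.getTotalLoadedClassCount() // since JVM start
                + " unloaded=" + bean.getUnloadedClassCount();      // since JVM start
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If the unloaded count stays near zero while totalLoaded keeps growing across job restarts, the classes of the old user classloaders are probably still being referenced somewhere.
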
> Flink Scheduler goes OOM, suspecting a memory leak
> --------------------------------------------------
>
>                 Key: FLINK-9080
>                 URL: https://issues.apache.org/jira/browse/FLINK-9080
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.4.0
>            Reporter: Rohit Singh
>            Priority: Critical
>         Attachments: Top Level packages.JPG, Top level classes.JPG, 
> classesloaded vs unloaded.png
>
>
> Running Flink version 1.4.0 on Mesos, with the scheduler running alongside 
> the job manager in a single container and the task managers running in 
> separate containers.
> A couple of jobs were running continuously, and the Flink scheduler was 
> working properly along with the task managers. Due to some change in the data, 
> one of the jobs started failing continuously. In the meantime, there was a 
> surge in Flink scheduler memory usage, and it eventually died of OOM.
>  
> A memory dump analysis was done, with the following findings: 
> !Top Level packages.JPG!!Top level classes.JPG!
>  * The majority of the top loaded packages retaining heap pointed to 
> Flinkuserclassloader, glassfish (the Jersey library), and Finalizer classes 
> (see the top level packages image).
>  * The top level classes were of Flinkuserclassloader (see the top level 
> classes image).
>  * The number of classes unloaded was far lower than the number loaded, in 
> spite of adding the JVM options -XX:+UseConcMarkSweepGC 
> -XX:+CMSClassUnloadingEnabled (see the attached classes loaded vs unloaded 
> graph; the scheduler was restarted 3 times).
>  * There were also custom classes that were duplicated during subsequent 
> class uploads.
> All the heap dump images are attached. Can you suggest some pointers on how 
> to overcome this issue?
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
