Hi! One thing you could try is to create a heap dump of the JVM when it crashes and look at all the classes it has loaded.
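As a rough sketch of what such a check could look like from inside the JVM (assuming a HotSpot JVM; the class name ClassLeakProbe and its methods are made up for illustration and are not part of Flink): the standard ClassLoadingMXBean reports how many classes are currently loaded and how many were ever unloaded, and the HotSpot-specific HotSpotDiagnosticMXBean can write a heap dump on demand. For a dump at the moment of the crash, the usual route is the JVM flag -XX:+HeapDumpOnOutOfMemoryError rather than code.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper for illustration only; not part of Flink. */
public final class ClassLeakProbe {

    private ClassLeakProbe() {}

    /** Number of classes currently loaded in this JVM. */
    public static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    /** Number of classes unloaded since the JVM started. */
    public static long unloadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getUnloadedClassCount();
    }

    /**
     * Writes a heap dump to the given path (HotSpot only).
     * The file must not exist yet; newer JDKs also require the name to end in ".hprof".
     */
    public static File dumpHeap(String path) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, true); // true = dump only objects that are still reachable
        return new File(path);
    }

    public static void main(String[] args) throws IOException {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        System.out.println("loaded now:   " + classes.getLoadedClassCount());
        System.out.println("loaded total: " + classes.getTotalLoadedClassCount());
        System.out.println("unloaded:     " + classes.getUnloadedClassCount());
        if (args.length > 0) {
            File dump = dumpHeap(args[0]);
            System.out.println("heap dump written to " + dump.getAbsolutePath());
        }
    }
}
```

If the loaded count keeps climbing from one batch run to the next while the unloaded count stays flat, that is a hint that something still references the classes of earlier jobs. The dump itself can then be opened in a heap analyzer to see which objects hold on to the old classloaders.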
For these long-running sessions (that share JVMs across jobs) it is important that classes are properly unloaded. If someone keeps holding references to the classes (either the system, the user code, or a library like the JDBC connector lib), then unloading cannot happen. Inspecting the dump would be one way to check that.

Greetings,
Stephan

On Fri, Apr 15, 2016 at 10:21 AM, Balaji Rajagopalan <balaji.rajagopa...@olacabs.com> wrote:

> Not a solution for your problem, but an alternative, I wrote my own sink
> function where I handle all sql activities (insert/update/select), used a
> 3rd lib for connection pooling, the code has been running stable in
> production without any issue.
>
> On Fri, Apr 15, 2016 at 1:41 PM, Maximilian Bode <maximilian.b...@tngtech.com> wrote:
>
>> Hi everyone,
>>
>> we are testing a long-running streaming application, which shares a yarn
>> session with a batch job (containing JDBC(In|Out)putFormat) that is
>> triggered periodically. Unfortunately, the session is dying after a few
>> runs of the batch job. In fact, each run of the batch job kills one task
>> manager due to OOME PermGen:
>> --
>> 2016-04-14 16:53:55,212 INFO org.apache.flink.runtime.taskmanager.Task - DataSink (org.apache.flink.api.java.io.jdbc.JDBCOutputFormat@787c33b) (1/3) switched to FAILED with exception.
>> java.lang.OutOfMemoryError: PermGen space
>>   at java.lang.ClassLoader.defineClass1(Native Method)
>>   at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>>   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>   at java.lang.ClassLoader.defineClass1(Native Method)
>>   at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>>   at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>>   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>   at oracle.jdbc.driver.OraclePreparedStatement.<clinit>(OraclePreparedStatement.java:102)
>>   at oracle.jdbc.driver.T4CDriverExtension.allocatePreparedStatement(T4CDriverExtension.java:67)
>>   at oracle.jdbc.driver.PhysicalConnection.prepareStatement(PhysicalConnection.java:3523)
>>   at oracle.jdbc.driver.PhysicalConnection.prepareStatement(PhysicalConnection.java:3409)
>>   at org.apache.flink.api.java.io.jdbc.JDBCOutputFormat.open(JDBCOutputFormat.java:79)
>>   at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:186)
>>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>>   at java.lang.Thread.run(Thread.java:744)
>> 2016-04-14 16:53:55,489 ERROR org.apache.flink.runtime.taskmanager.Task - FATAL - exception in task exception handler
>> java.lang.OutOfMemoryError: PermGen space
>>   at sun.misc.Unsafe.defineClass(Native Method)
>>   at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
>>   at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
>>   at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
>>   at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:113)
>>   at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:331)
>>   at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1376)
>>   at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:72)
>>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:493)
>>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>>   at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>>   at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:300)
>>   at org.apache.flink.runtime.util.SerializedThrowable.<init>(SerializedThrowable.java:83)
>>   at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:108)
>>   at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:78)
>>   at org.apache.flink.runtime.taskmanager.Task.notifyObservers(Task.java:865)
>>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:616)
>>   at java.lang.Thread.run(Thread.java:744)
>> 2016-04-14 16:53:55,489 ERROR org.apache.flink.runtime.taskmanager.Task - FATAL - exception in task exception handler
>> java.lang.OutOfMemoryError: PermGen space
>>   at sun.misc.Unsafe.defineClass(Native Method)
>>   at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
>>   at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
>>   at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
>>   at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:113)
>>   at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:331)
>>   at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1376)
>>   at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:72)
>>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:493)
>>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>>   at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>>   at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:300)
>>   at org.apache.flink.runtime.util.SerializedThrowable.<init>(SerializedThrowable.java:83)
>>   at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:108)
>>   at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:78)
>>   at org.apache.flink.runtime.taskmanager.Task.notifyObservers(Task.java:865)
>>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:616)
>>   at java.lang.Thread.run(Thread.java:744)
>>
>> --
>> This problem seems to be reproducible. In the first run it happens
>> towards the end of the job in a JDBCOutputFormat. From then on, an
>> analogous exception is thrown in the JDBCInputFormat, an earlier operator.
>>
>> We suspect there might be a memory leak caused by the Classloader, any
>> ideas?
>>
>> Best regards,
>> Max
>>
>> --
>> Maximilian Bode * Software Consultant * maximilian.b...@tngtech.com
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Managing Directors: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> Registered office: Unterföhring * Amtsgericht München * HRB 135082
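
On the classloader suspicion in the quoted message: one well-known way a JDBC library can keep a job's classloader alive is the driver registry in java.sql.DriverManager, which belongs to the JVM-wide system classes and holds a reference to every registered driver. Whether that is what happens here is only a guess until a dump confirms it. A minimal sketch of a cleanup step follows; the class JdbcDriverCleanup and its method are hypothetical names, and where to call it (for example when the job's output format is closed) is an assumption as well.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

/** Hypothetical helper for illustration only; not part of Flink or any JDBC driver. */
public final class JdbcDriverCleanup {

    private JdbcDriverCleanup() {}

    /**
     * Deregisters every JDBC driver whose class was loaded by the given classloader,
     * so that DriverManager no longer references classes from that loader.
     *
     * @return the number of drivers that were deregistered
     */
    public static int deregisterDriversLoadedBy(ClassLoader loader) throws SQLException {
        int removed = 0;
        // getDrivers() returns a snapshot, so deregistering while iterating is safe.
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver driver = drivers.nextElement();
            if (driver.getClass().getClassLoader() == loader) {
                DriverManager.deregisterDriver(driver);
                removed++;
            }
        }
        return removed;
    }
}
```

Called with the classloader that loaded the job's classes, this only removes drivers that came in with the job and leaves drivers from other classloaders registered. It does not help if the references are held elsewhere, for example by threads or static caches inside the driver.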