Hi!

One thing you could try is to create a heap dump of the JVM when it crashes
and look at all the classes it has loaded.
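
As a rough sketch of a lighter-weight check to run alongside a dump: the
standard ClassLoadingMXBean reports how many classes are currently loaded and
how many have been unloaded so far. The class name ClassLoadingProbe and the
idea of logging these numbers between runs of the batch job are assumptions
made for illustration, not something Flink provides. For the dump itself,
HotSpot JVMs accept the -XX:+HeapDumpOnOutOfMemoryError flag.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of Flink): reports JVM class loading counters.
public class ClassLoadingProbe {

    /** Returns {currently loaded, total ever loaded, unloaded} class counts. */
    static long[] classCounts() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return new long[] {
            bean.getLoadedClassCount(),
            bean.getTotalLoadedClassCount(),
            bean.getUnloadedClassCount()
        };
    }

    public static void main(String[] args) {
        long[] counts = classCounts();
        System.out.println("loaded now:   " + counts[0]);
        System.out.println("loaded total: " + counts[1]);
        System.out.println("unloaded:     " + counts[2]);
    }
}
```

If the unloaded count stays at zero while the loaded count grows with every
run of the batch job, that would hint that the job's classes are never
unloaded.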

For these long-running sessions (which share JVMs across jobs), it is
important that classes are properly unloaded.
If anything keeps holding references to the classes (the system, the user
code, or a library such as the JDBC connector), then unloading cannot
happen. A heap dump would be one way to check that.
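
To illustrate how a library can hold such a reference: java.sql.DriverManager
keeps every registered JDBC driver in a static list, so a driver class loaded
by a job's class loader can keep that class loader reachable. Whether this is
what happens in your case is only a guess. Below is a minimal sketch of
deregistering the drivers that belong to a given class loader; the class name
DriverCleanup and the place you would call it from (for example at the end of
the job) are assumptions. Depending on the Java version, DriverManager may
refuse to deregister a driver the caller is not allowed to access.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Flink or the JDBC connector).
public class DriverCleanup {

    /**
     * Deregisters all JDBC drivers whose class was loaded by the given
     * class loader and returns how many were removed.
     */
    static int deregisterDriversOf(ClassLoader loader) {
        // Copy first so that the registry is not modified while iterating.
        List<Driver> drivers = new ArrayList<>(Collections.list(DriverManager.getDrivers()));
        int removed = 0;
        for (Driver driver : drivers) {
            if (driver.getClass().getClassLoader() == loader) {
                try {
                    DriverManager.deregisterDriver(driver);
                    removed++;
                } catch (SQLException e) {
                    throw new IllegalStateException("Could not deregister " + driver, e);
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ClassLoader userLoader = Thread.currentThread().getContextClassLoader();
        System.out.println("deregistered drivers: " + deregisterDriversOf(userLoader));
    }
}
```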

Greetings,
Stephan


On Fri, Apr 15, 2016 at 10:21 AM, Balaji Rajagopalan <
balaji.rajagopa...@olacabs.com> wrote:

> Not a solution to your problem, but an alternative: I wrote my own sink
> function that handles all SQL activity (insert/update/select) and uses a
> third-party library for connection pooling. The code has been running
> stably in production without any issues.
>
> On Fri, Apr 15, 2016 at 1:41 PM, Maximilian Bode <
> maximilian.b...@tngtech.com> wrote:
>
>> Hi everyone,
>>
>> we are testing a long-running streaming application, which shares a YARN
>> session with a batch job (containing JDBC(In|Out)putFormat) that is
>> triggered periodically. Unfortunately, the session dies after a few
>> runs of the batch job. In fact, each run of the batch job kills one task
>> manager with an OutOfMemoryError in PermGen space:
>> --
>> 2016-04-14 16:53:55,212 INFO  org.apache.flink.runtime.taskmanager.Task - DataSink (org.apache.flink.api.java.io.jdbc.JDBCOutputFormat@787c33b) (1/3) switched to FAILED with exception.
>> java.lang.OutOfMemoryError: PermGen space
>> at java.lang.ClassLoader.defineClass1(Native Method)
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>> at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> at java.lang.ClassLoader.defineClass1(Native Method)
>> at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>> at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> at oracle.jdbc.driver.OraclePreparedStatement.<clinit>(OraclePreparedStatement.java:102)
>> at oracle.jdbc.driver.T4CDriverExtension.allocatePreparedStatement(T4CDriverExtension.java:67)
>> at oracle.jdbc.driver.PhysicalConnection.prepareStatement(PhysicalConnection.java:3523)
>> at oracle.jdbc.driver.PhysicalConnection.prepareStatement(PhysicalConnection.java:3409)
>> at org.apache.flink.api.java.io.jdbc.JDBCOutputFormat.open(JDBCOutputFormat.java:79)
>> at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:186)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>> at java.lang.Thread.run(Thread.java:744)
>> 2016-04-14 16:53:55,489 ERROR org.apache.flink.runtime.taskmanager.Task - FATAL - exception in task exception handler
>> java.lang.OutOfMemoryError: PermGen space
>> at sun.misc.Unsafe.defineClass(Native Method)
>> at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
>> at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
>> at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
>> at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:113)
>> at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:331)
>> at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1376)
>> at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:72)
>> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:493)
>> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>> at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:300)
>> at org.apache.flink.runtime.util.SerializedThrowable.<init>(SerializedThrowable.java:83)
>> at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:108)
>> at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:78)
>> at org.apache.flink.runtime.taskmanager.Task.notifyObservers(Task.java:865)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:616)
>> at java.lang.Thread.run(Thread.java:744)
>> 2016-04-14 16:53:55,489 ERROR org.apache.flink.runtime.taskmanager.Task - FATAL - exception in task exception handler
>> java.lang.OutOfMemoryError: PermGen space
>> at sun.misc.Unsafe.defineClass(Native Method)
>> at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
>> at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
>> at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
>> at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:113)
>> at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:331)
>> at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1376)
>> at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:72)
>> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:493)
>> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:464)
>> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1133)
>> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>> at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:300)
>> at org.apache.flink.runtime.util.SerializedThrowable.<init>(SerializedThrowable.java:83)
>> at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:108)
>> at org.apache.flink.runtime.taskmanager.TaskExecutionState.<init>(TaskExecutionState.java:78)
>> at org.apache.flink.runtime.taskmanager.Task.notifyObservers(Task.java:865)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:616)
>> at java.lang.Thread.run(Thread.java:744)
>>
>> --
>> This problem seems to be reproducible. In the first run it happens
>> towards the end of the job in a JDBCOutputFormat. From then on, an
>> analogous exception is thrown in the JDBCInputFormat, an earlier operator.
>>
>> We suspect there might be a memory leak caused by the class loader. Any
>> ideas?
>>
>> Best regards,
>> Max
>>
>> —
>> Maximilian Bode * Software Consultant * maximilian.b...@tngtech.com
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>
>>
>
