The heap dump did not show anything too suspicious. The only thing I
noticed is that there are 13 ChildFirstClassLoaders whereas there are only
6 Task instances in the heap dump. Are you running all 13 tasks on the same
TaskExecutor?
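
One thing that might be worth checking (an assumption on my side, the heap
dump does not prove it): JDBC drivers register themselves with
java.sql.DriverManager, which is loaded by the JVM and not by the job. A
driver class that was loaded through a job's ChildFirstClassLoader then keeps
that class loader, and all of its classes in metaspace, reachable after the
task has finished. A minimal sketch of deregistering such drivers, where
JdbcDriverCleanup is a hypothetical helper name and not part of Flink or the
ClickHouse driver:

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Flink or the ClickHouse driver.
final class JdbcDriverCleanup {

    private JdbcDriverCleanup() {}

    /**
     * Deregisters every JDBC driver whose class was loaded by the given class
     * loader and returns the class names of the drivers that were removed.
     */
    static List<String> deregisterDriversLoadedBy(ClassLoader loader) {
        List<String> removed = new ArrayList<>();
        // DriverManager only exposes drivers that are visible to the caller.
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            if (driver.getClass().getClassLoader() == loader) {
                try {
                    DriverManager.deregisterDriver(driver);
                    removed.add(driver.getClass().getName());
                } catch (SQLException e) {
                    throw new IllegalStateException(
                            "Could not deregister " + driver.getClass().getName(), e);
                }
            }
        }
        return removed;
    }
}
```

The helper would have to live in the job jar, because DriverManager only
shows a caller the drivers its own class loader can see. One place to try it
is the close() method of the output format, with getClass().getClassLoader()
as the argument.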

Cheers,
Till

On Mon, Aug 24, 2020 at 2:01 PM Till Rohrmann <trohrm...@apache.org> wrote:

> What could also cause the problem is that the metaspace memory budget is
> configured too tightly. Here is a pointer to increasing the metaspace size
> [1].
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/ops/memory/mem_trouble.html#outofmemoryerror-metaspace
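>
> As a rough sketch, assuming Flink 1.10 or later and that the setting goes
> into flink-conf.yaml, the budget could be raised as follows. The value 512m
> is only a placeholder, not a recommendation. If class loaders are leaking, a
> larger budget only delays the error.
>
> ```yaml
> # flink-conf.yaml (illustrative value; tune it to your job)
> taskmanager.memory.jvm-metaspace.size: 512m
> ```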
>
> Cheers,
> Till
>
> On Mon, Aug 24, 2020 at 1:49 PM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
>> Hi,
>>
>> could you share the Flink cluster logs with us? This would help answer
>> a lot of questions about your setup and the Flink version you are using.
>> Thanks a lot!
>>
>> Cheers,
>> Till
>>
>> On Mon, Aug 24, 2020 at 10:48 AM 耿延杰 <gyj199...@qq.com> wrote:
>>
>>> It still fails after every 12 tasks.
>>> And the exception stack of the failed tasks is different.
>>>
>>> For example, the exception info of the most recently failed task:
>>> Caused by: java.lang.OutOfMemoryError: Metaspace
>>>     at java.lang.ClassLoader.defineClass1(Native Method)
>>>     at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
>>>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
>>>     at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
>>>     at org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:66)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
>>>     at org.apache.http.impl.client.CloseableHttpClient.determineTarget(CloseableHttpClient.java:93)
>>>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:614)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:117)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:95)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:90)
>>>     at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:94)
>>>     at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:80)
>>>     at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55)
>>>     at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:47)
>>>     at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:29)
>>>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>>>     at java.sql.DriverManager.getConnection(DriverManager.java:270)
>>>     at org.apache.flink.api.java.io.jdbc.AbstractJDBCOutputFormat.establishConnection(AbstractJDBCOutputFormat.java:68)
>>>     at com.xxx.clickhouse.ClickHouseJDBCOutputFormat.open(ClickHouseJDBCOutputFormat.java:53)
>>>     at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:205)
>>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
>>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
>>>     at java.lang.Thread.run(Thread.java:748)
>>>
>>> This is different from the exception info in my last email.
>>>
>>> So analysing the dump file is the key.
>>>
>>> ------------------ Original Message ------------------
>>> From: "耿延杰" <gyj199...@qq.com>
>>> Sent: Monday, August 24, 2020, 4:33 PM
>>> To: "dev" <dev@flink.apache.org>
>>> Subject: Re: OutOfMemoryError: Metaspace on Batch Task When Write into
>>> Clickhouse
>>>
>>> Additional info:
>>>
>>> The exception info shown on the Flink manager page:
>>>
>>> Caused by: java.lang.OutOfMemoryError: Metaspace
>>>     at java.lang.ClassLoader.defineClass1(Native Method)
>>>     at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
>>>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
>>>     at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
>>>     at org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:66)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
>>>     at org.apache.http.impl.client.CloseableHttpClient.determineTarget(CloseableHttpClient.java:93)
>>>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:614)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:117)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:95)
>>>     at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:90)
>>>     at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:94)
>>>     at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:80)
>>>     at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55)
>>>     at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:47)
>>>     at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:29)
>>>     at java.sql.DriverManager.getConnection(DriverManager.java:664)
>>>     at java.sql.DriverManager.getConnection(DriverManager.java:270)
>>>     at org.apache.flink.api.java.io.jdbc.AbstractJDBCOutputFormat.establishConnection(AbstractJDBCOutputFormat.java:68)
>>>     at com.xx.xx.xx.ClickHouseJDBCOutputFormat.open(ClickHouseJDBCOutputFormat.java:53)
>>>     at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:205)
>>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
>>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
>>>     at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>>
>>>
>>>
>>>
>>> ------------------ Original Message ------------------
>>> From: "耿延杰" <gyj199...@qq.com>
>>> Sent: Monday, August 24, 2020, 4:20 PM
>>> To: "dev" <dev@flink.apache.org>
>>> Subject: OutOfMemoryError: Metaspace on Batch Task When Write into
>>> Clickhouse
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>> I get an "OutOfMemoryError: Metaspace" in a batch task when writing
>>> into ClickHouse.
>>> The attached *.java file is my task code.
>>>
>>> I found that after running 12 tasks, the 13th task fails, and the
>>> exception is always "OutOfMemoryError: Metaspace" (see
>>> "task-failed.png").
>>>
>>>
>>> I configured -XX:+HeapDumpOnOutOfMemoryError
>>> -XX:HeapDumpPath=/path/to/hprofFile
>>> and obtained an hprof dump file.
>>> I analysed this hprof file and found that the error may not be caused by
>>> my user code.
>>> So I am asking for your help to confirm whether the memory leak
>>> is caused by Flink.
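>>>
>>> For reference, a minimal sketch of how such flags can be passed to the
>>> Flink JVMs, assuming they are set through env.java.opts in
>>> flink-conf.yaml (the dump path is the placeholder from above):
>>>
>>> ```yaml
>>> # flink-conf.yaml: extra JVM options for the Flink processes
>>> env.java.opts: -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/hprofFile
>>> ```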
>>>
>>>
>>> The attached file "java_pid29294.hprof" is the dump file.
>>>
>>>
>>> Thanks.
>>
>>
