My application program looks like this. Is there a problem with this
structure?
public class StreamingJob {
    public static void main(String[] args) throws Exception {
        int i = 0;
        while (i < 100) {
            try {
                StreamExecutionEnvironment env =
                        StreamExecutionEnvironment.getExecutionEnvironment();
                env.setRuntimeMode(RuntimeExecutionMode.BATCH);
                env.setParallelism(Parallelism);
                EnvironmentSettings bsSettings = EnvironmentSettings.newInstance()
                        .useBlinkPlanner()
                        .inStreamingMode()
                        .build();
                StreamTableEnvironment bsTableEnv =
                        StreamTableEnvironment.create(env, bsSettings);
                bsTableEnv.executeSql("CREATE TEMPORARY TABLE xxxx");
                Table t = bsTableEnv.sqlQuery(query);
                DataStream<DataPoint> points =
                        bsTableEnv.toAppendStream(t, DataPoint.class);
                DataStream<StatisPoint> weightPoints = points.map();
                DataStream<PredictPoint> predictPoints =
                        weightPoints.keyBy().reduce().map();
                // side output
                final OutputTag<PredictPoint> outPutPredict =
                        new OutputTag<PredictPoint>("predict") {};
                SingleOutputStreamOperator<PredictPoint> mainDataStream =
                        predictPoints.process();
                DataStream<PredictPoint> exStream =
                        mainDataStream.getSideOutput(outPutPredict);
                // write data to clickhouse
                String insertIntoCKSql = "xxx";
                mainDataStream.addSink(JdbcSink.sink(
                        insertIntoCKSql,
                        new CkSinkBuilder(),
                        new JdbcExecutionOptions.Builder()
                                .withBatchSize(CkBatchSize)
                                .build(),
                        new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                                .withDriverName(CkDriverName)
                                .withUrl(CkUrl)
                                .withUsername(CkUser)
                                .withPassword(CkPassword)
                                .build()));
                // write data to kafka
                FlinkKafkaProducer<String> producer = new FlinkKafkaProducer<>();
                exStream.map().addSink(producer);
                env.execute("Prediction Program");
            } catch (Exception e) {
                e.printStackTrace();
            }
            i++;
            Thread.sleep(window * 1000);
        }
    }
}
------------------ Original Message ------------------
*From:* "Arvid Heise" <ar...@apache.org>;
*Sent:* Thursday, April 8, 2021, 2:33 PM
*To:* "Yangze Guo" <karma...@gmail.com>;
*Cc:* "太平洋" <495635...@qq.com>; "user" <user@flink.apache.org>;
"guowei.mgw" <guowei....@gmail.com>; "renqschn" <renqs...@gmail.com>;
*Subject:* Re: period batch job lead to OutOfMemoryError: Metaspace
problem
Hi,
ChildFirstClassLoaders are created (more or less) per application
jar, and seeing so many of them looks like a classloader leak to me. I'd
expect you to see a new ChildFirstClassLoader popping up with
each new job submission.
Can you check who is referencing the ChildFirstClassLoader
transitively? Usually, it's some thread that is lingering around
because some third party library is leaking threads etc.
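As an illustration of that check (a sketch, not part of the original reply), the snippet below lists every live thread in the current JVM whose context classloader chain contains a loader whose class name includes a given fragment. The class `ClassLoaderLeakCheck` and the method `threadsPinning` are hypothetical names; only JDK calls are used. It is assumed to run inside the TaskManager JVM (for example from a user function), and it only covers references held through a thread's context classloader, so a "path to GC roots" query on the heap dump remains the more complete check. The loader a thread holds may also be a wrapper around ChildFirstClassLoader, which is why the name is matched as a fragment.

```java
import java.util.ArrayList;
import java.util.List;

class ClassLoaderLeakCheck {

    /**
     * Returns one line per live thread whose context classloader, or any
     * parent of it, has a class name containing the given fragment.
     * Hypothetical helper for illustration only.
     */
    static List<String> threadsPinning(String loaderNameFragment) {
        List<String> hits = new ArrayList<>();
        // getAllStackTraces() is used here only to enumerate live threads.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            for (ClassLoader cl = t.getContextClassLoader(); cl != null; cl = cl.getParent()) {
                String loaderName = cl.getClass().getName();
                if (loaderName.contains(loaderNameFragment)) {
                    hits.add(t.getName()
                            + " (daemon=" + t.isDaemon() + ", state=" + t.getState() + ") -> "
                            + loaderName + "@" + Integer.toHexString(System.identityHashCode(cl)));
                    break; // one hit per thread is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Default fragment is an assumption based on the heap dump in this thread.
        String fragment = args.length > 0 ? args[0] : "ChildFirstClassLoader";
        threadsPinning(fragment).forEach(System.out::println);
    }
}
```

If threads from earlier job submissions (for instance producer or connection-pool threads) still show up after their job has finished, they are candidates for what keeps the old classloaders reachable.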
OneInputStreamTask is legit and just indicates that you have a
job running with 4 slots on that TM. It should not hold any
dedicated metaspace memory.
On Thu, Apr 8, 2021 at 4:52 AM Yangze Guo <karma...@gmail.com> wrote:
I went through the JM & TM logs but could not find any
valuable clue.
The exception is actually thrown by
kafka-producer-network-thread.
Maybe @Qingsheng could also take a look?
Best,
Yangze Guo
On Thu, Apr 8, 2021 at 10:39 AM 太平洋 <495635...@qq.com> wrote:
>
> I have configured it to 512M, but the problem still exists. Now the
> memory size is still 256M.
> The TM and JM logs are attached.
>
> Looking forward to your reply.
>
> ------------------ Original Message ------------------
> From: "Yangze Guo" <karma...@gmail.com>;
> Sent: Tuesday, April 6, 2021, 6:35 PM
> To: "太平洋" <495635...@qq.com>;
> Cc: "user" <user@flink.apache.org>; "guowei.mgw" <guowei....@gmail.com>;
> Subject: Re: period batch job lead to OutOfMemoryError: Metaspace problem
>
> > I have tried this method, but the problem still exists.
> How much memory did you configure for it?
>
> > is 21 instances of "org.apache.flink.util.ChildFirstClassLoader" normal
> Not quite sure about it. AFAIK, each job will have a classloader.
> Multiple tasks of the same job in the same TM will share the same
> classloader. The classloader will be removed if there are no more tasks
> running on the TM. A classloader without references will eventually be
> cleaned up by GC. Could you share the JM and TM logs for further analysis?
> I'll also involve @Guowei Ma in this thread.
>
>
> Best,
> Yangze Guo
>
> On Tue, Apr 6, 2021 at 6:05 PM 太平洋 <495635...@qq.com> wrote:
> >
> > I have tried this method, but the problem still exists.
> > According to heap dump analysis, there are 21 instances of
> > "org.apache.flink.util.ChildFirstClassLoader". Is that normal?
> >
> >
> > ------------------ Original Message ------------------
> > From: "Yangze Guo" <karma...@gmail.com>;
> > Sent: Tuesday, April 6, 2021, 4:32 PM
> > To: "太平洋" <495635...@qq.com>;
> > Cc: "user" <user@flink.apache.org>;
> > Subject: Re: period batch job lead to OutOfMemoryError: Metaspace problem
> >
> > I think you can try to increase the JVM metaspace option for
> > TaskManagers through taskmanager.memory.jvm-metaspace.size. [1]
> >
> > [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/memory/mem_trouble/#outofmemoryerror-metaspace
> >
> > Best,
> > Yangze Guo
> >
> >
> > On Tue, Apr 6, 2021 at 4:22 PM 太平洋 <495635...@qq.com> wrote:
> > >
> > > batch job:
> > > Read data from S3 via SQL, apply some operators, and write the
> > > data to ClickHouse and Kafka.
> > > After some time, the TaskManager quits with OutOfMemoryError: Metaspace.
> > >
> > > env:
> > > flink version: 1.12.2
> > > task-manager slot count: 5
> > > deployment: standalone Kubernetes session mode
> > > dependencies:
> > >
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-connector-kafka_2.11</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>com.google.code.gson</groupId>
> > >   <artifactId>gson</artifactId>
> > >   <version>2.8.5</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-connector-jdbc_2.11</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>ru.yandex.clickhouse</groupId>
> > >   <artifactId>clickhouse-jdbc</artifactId>
> > >   <version>0.3.0</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-parquet_2.11</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-json</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > >
> > >
> > > heap dump1:
> > >
> > > Leak Suspects
> > >
> > > Problem Suspect 1
> > >
> > > 21 instances of "org.apache.flink.util.ChildFirstClassLoader", loaded by "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 29,656,880 (41.16%) bytes.
> > >
> > > Biggest instances:
> > >
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca2a1e8 - 1,474,760 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2af820 - 1,474,168 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cdcaa10 - 1,474,160 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cf6aab0 - 1,474,160 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d1111d8 - 1,474,160 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2bb108 - 1,474,128 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73de202e0 - 1,474,120 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dadc778 - 1,474,112 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d5f70e8 - 1,474,064 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d93aa38 - 1,474,064 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e179638 - 1,474,064 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dc80418 - 1,474,056 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dfcda60 - 1,474,056 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e4bcd38 - 1,474,056 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d6006e8 - 1,474,032 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73c7d2ad8 - 1,461,944 (2.03%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca1bb98 - 1,460,752 (2.03%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73bf203f0 - 1,460,744 (2.03%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e3284a8 - 1,445,232 (2.01%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e65de00 - 1,445,232 (2.01%) bytes.
> > >
> > >
> > >
> > > Keywords
> > > org.apache.flink.util.ChildFirstClassLoader
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > > Problem Suspect 2
> > >
> > > 34,407 instances of "org.apache.flink.core.memory.HybridMemorySegment", loaded by "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 7,707,168 (10.70%) bytes.
> > >
> > > Keywords
> > > org.apache.flink.core.memory.HybridMemorySegment
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > >
> > >
> > >
> > > heap dump2:
> > >
> > > Leak Suspects
> > >
> > > Problem Suspect 1
> > >
> > > 21 instances of "org.apache.flink.util.ChildFirstClassLoader", loaded by "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 26,061,408 (30.68%) bytes.
> > >
> > > Biggest instances:
> > >
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e9e9930 - 1,474,224 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73edce0b8 - 1,474,224 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f1ad7d0 - 1,474,168 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f3e5118 - 1,474,168 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f5d3fe0 - 1,474,168 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ebd8d28 - 1,474,160 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73efc00c0 - 1,474,160 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e2251a8 - 1,474,136 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cc24af0 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cdca3e0 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cf6f860 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d114768 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca6f878 - 1,474,056 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2b7640 - 1,474,056 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2c1d80 - 1,474,040 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73c7e2868 - 1,469,720 (1.73%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73bf34a98 - 1,460,808 (1.72%) bytes.
> > >
> > >
> > >
> > > Keywords
> > > org.apache.flink.util.ChildFirstClassLoader
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > > Problem Suspect 2
> > >
> > > 4 instances of "org.apache.flink.streaming.runtime.tasks.OneInputStreamTask", loaded by "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 11,644,200 (13.71%) bytes.
> > >
> > > Biggest instances:
> > >
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73e2d0cb0 - 4,364,536 (5.14%) bytes.
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73d62fb88 - 3,643,576 (4.29%) bytes.
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73dae0270 - 3,635,952 (4.28%) bytes.
> > >
> > >
> > >
> > > Keywords
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask
> > >
> > >