Hi everyone,
I am running into a problem with the JDBCInputFormat that first showed up in
a larger Flink job. As a minimal example, I can reproduce it by just reading
data from a database and writing it to a CSV file, i.e.
    DataSet<Tuple1<String>> existingData = env.createInput(
        JDBCInputFormat.buildJDBCInputFormat()
            .setDrivername("oracle.jdbc.driver.OracleDriver")
            .setUsername(…)
            .setPassword(…)
            .setDBUrl(…)
            .setQuery("select DATA from TABLENAME")
            .finish(),
        new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
    existingData.writeAsCsv(…);
where DATA is a column containing strings of length ~25 characters and
TABLENAME contains 20 million rows.
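For a sense of scale, here is a rough back-of-the-envelope sketch of the raw string payload in that table. It assumes about 2 bytes per character for Java strings and ignores all per-object overhead; both are my assumptions, and the class and method names (PayloadEstimate, rawBytes) are made up purely for illustration.

```java
public class PayloadEstimate {

    // Raw character payload in bytes: rows * charsPerRow * bytesPerChar.
    // Deliberately ignores String/Tuple object headers and any driver-side buffering.
    public static long rawBytes(long rows, int charsPerRow, int bytesPerChar) {
        return rows * charsPerRow * bytesPerChar;
    }

    public static void main(String[] args) {
        // 20 million rows, ~25 characters each, assumed 2 bytes per character
        long bytes = rawBytes(20_000_000L, 25, 2);
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.println(bytes + " bytes, roughly " + gib + " GiB");
    }
}
```

Under these assumptions the whole column comes to 10^9 bytes, i.e. just under 1 GiB of character data in total, and the job should never need to hold all of it in memory at once.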
After starting the job on a YARN cluster (using -tm 3072 and leaving the other
memory settings at their default values), Flink happily goes along at first but
then fails after roughly three million records have been emitted by the
JDBCInputFormat. The exception reads "The slot in which the task was executed
has been released. Probably loss of TaskManager …". The local taskmanager.log
in the affected container reads
"java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)"
Any ideas what is going wrong here?
Cheers,
Max
--
Maximilian Bode * Junior Consultant
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Managing directors: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Registered office: Unterföhring * Amtsgericht München * HRB 135082