Hi all,

I'm trying to read a directory recursively, but it seems that the totalLength value in FileInputFormat.createInputSplits() is not computed correctly.
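As an independent cross-check of the total size I would expect, something like the sketch below can be run outside of Flink. It uses only plain java.nio, and the class name TotalLengthCheck is made up for illustration. It is not how FileInputFormat computes totalLength internally, just a number to compare against.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink: sums file sizes below a directory.
final class TotalLengthCheck {

    // Returns the combined size in bytes of all regular files under root,
    // descending into nested subdirectories (e.g. root/A/B/...).
    static long totalLength(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the directory from this report when no argument is given.
        Path root = Paths.get(args.length > 0 ? args[0] : "/tmp/myDir");
        System.out.println("total bytes under " + root + ": " + totalLength(root));
    }
}
```

If the value logged by Flink differs a lot from this number, that would point at the enumeration rather than at the data itself.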
I have files organized as:

    /tmp/myDir/A/B/cunk-1.txt
    /tmp/myDir/A/B/cunk-2.txt
    ..

If I try to do the following:

    Configuration parameters = new Configuration();
    parameters.setBoolean("recursive.file.enumeration", true);
    env.readTextFile("file:////tmp/myDir").withParameters(parameters).print();

I get:

    Caused by: org.apache.flink.runtime.JobException: Creating the input splits caused an error: Java heap space
        at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:162)
        at org.apache.flink.runtime.executiongraph.ExecutionGraph.attachJobGraph(ExecutionGraph.java:471)
        at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:515)
        ... 19 more
    Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2219)
        at java.util.ArrayList.grow(ArrayList.java:242)
        at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:216)
        at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:208)
        at java.util.ArrayList.add(ArrayList.java:440)
        at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:503)
        at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:51)
        at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:146)

Am I doing something wrong, or is it a bug?

Best,
Flavio