Hey everyone,

I just noticed that when processing input splits from a DelimitedInputFormat (specifically, a text file with words in it), the entire read buffer is filled if the splitLength is 0 (see https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/DelimitedInputFormat.java#L577).

I'm using XtreemFS as the underlying file system, which stripes files across storage servers in blocks of 128 KB. I have 8 physically separate nodes and a 1 MB input file, so each node stores 128 KB of data. This is reported accurately to Flink (e.g. split sizes and hostnames). Now, when the splitLength becomes 0 at some point during processing (which it eventually will), the entire file is read in again, which rather defeats the point of processing a split of length 0.

Is this intended behavior? I've tried several hot-fixes, but they all resulted in the file not being read in its entirety. I'd like to understand the rationale behind this implementation, and maybe figure out a way around it.

Thanks in advance,
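For reference, here is a minimal, self-contained model of the behavior I'm describing. The names and structure are my own simplification, not Flink's actual code; it just captures the read-size decision I'm seeing at the linked line:

```java
// Sketch (my own simplification, not Flink's code): while the split still has
// bytes remaining, a read is bounded by that remainder; once splitLength hits
// 0, the next read requests a full buffer, which on a small file can pull in
// data far beyond the split's own range.
public class SplitReadSketch {
    static final int BUFFER_SIZE = 128 * 1024; // matches my 128 KB stripe size

    // Returns how many bytes the next read would request for the given
    // remaining split length.
    static int nextReadSize(long splitLength) {
        if (splitLength > 0) {
            // Normal case: never read past the end of the split.
            return (int) Math.min(splitLength, BUFFER_SIZE);
        }
        // splitLength == 0: the behavior in question -- request a full
        // buffer even though the split itself has no bytes left.
        return BUFFER_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(nextReadSize(64 * 1024)); // bounded by the remaining split
        System.out.println(nextReadSize(0));         // full buffer despite an empty split
    }
}
```

With a 1 MB file and 128 KB buffers, that second case is what makes the length-0 split re-read far more data than it owns.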
Robert

--
My GPG Key ID: 336E2680