Hi Dani,

The batch API does not expose an addSource-like method, but you can always write your own InputFormat and pass it directly to the constructor of the DataSource. DataSource extends DataSet, so you get all the usual DataSet methods in the end. For an example, have a look here. [1]
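
To make the idea a bit more concrete, here is a minimal sketch of the reading logic such an input format would contain. It is plain Java with no Flink dependency: `RecordReader` is a hypothetical stand-in that only mirrors the open / reachedEnd / nextRecord / close lifecycle of Flink's InputFormat (Flink's nextRecord additionally takes a reuse object), and `LocalLineReader` and the idea of one identical path on every worker are assumptions for illustration. In Flink the same logic would live in an InputFormat implementation, which I have not tested here. Note also that split assignment alone does not guarantee exactly one split per machine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the record-reading part of an input format.
// This is NOT a Flink class; it only mirrors the lifecycle.
interface RecordReader<T> {
    void open() throws IOException;
    boolean reachedEnd() throws IOException;
    T nextRecord() throws IOException;
    void close() throws IOException;
}

// Reads the lines of a file from the local filesystem of whichever
// machine calls open(). The path is assumed to be the same on every worker.
class LocalLineReader implements RecordReader<String> {
    private final String localPath;
    private BufferedReader reader;
    private String lookahead; // next line to return, null at end of file

    LocalLineReader(String localPath) {
        this.localPath = localPath;
    }

    @Override
    public void open() throws IOException {
        reader = Files.newBufferedReader(Paths.get(localPath), StandardCharsets.UTF_8);
        lookahead = reader.readLine();
    }

    @Override
    public boolean reachedEnd() {
        return lookahead == null;
    }

    @Override
    public String nextRecord() throws IOException {
        String current = lookahead;
        lookahead = reader.readLine();
        return current;
    }

    @Override
    public void close() throws IOException {
        if (reader != null) {
            reader.close();
            reader = null;
        }
    }

    // Convenience helper: drives the lifecycle the way a runtime would.
    static List<String> readAll(String path) throws IOException {
        LocalLineReader r = new LocalLineReader(path);
        List<String> out = new ArrayList<>();
        r.open();
        try {
            while (!r.reachedEnd()) {
                out.add(r.nextRecord());
            }
        } finally {
            r.close();
        }
        return out;
    }
}
```

The one-line lookahead is there so that reachedEnd() can answer without consuming a record, which is the order in which the runtime calls the two methods.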

[1] https://github.com/dataArtisans/flink-dataflow/blob/master/src/main/java/com/dataartisans/flink/dataflow/translation/FlinkTransformTranslators.java#L133

Best,
Marton

On Sun, Jun 14, 2015 at 4:34 PM, Dániel Bali <balijanosdan...@gmail.com> wrote:

> Hello!
>
> We are running an experiment on a cluster and we have a large input split
> into multiple files. We'd like to run a Flink job that reads the local file
> on each instance and processes that. Is there a way to do this in the batch
> environment? `readTextFile` wants to read the file on the JobManager and
> split that right there, which is not what we want.
>
> We solved it in the streaming environment by using `addSource`, but there
> is no similar function in the batch version. Does anybody know how this
> could be done?
>
> Thanks!
> Daniel
>