I'm sorry, there is no such documentation. You need to look at the code :-(

2015-10-07 15:19 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
> And what is the split policy for the FileInputFormat? Does it depend on the
> fs block size?
> Is there a pointer to the several Flink input formats and a description of
> their internals?
>
> On Wed, Oct 7, 2015 at 3:09 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi Flavio,
>>
>> it is not possible to split by line count because that would mean reading
>> and parsing the file just for splitting.
>>
>> Parallel processing of data sources depends on the input splits created
>> by the InputFormat. Local files can be split just like files in HDFS.
>> Usually, each file corresponds to at least one split, but multiple files
>> could also be put into a single split if necessary. The logic for that
>> would go into the InputFormat.createInputSplits() method.
>>
>> Cheers, Fabian
>>
>> 2015-10-07 14:47 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>
>>> Hi to all,
>>>
>>> is there a way to split a single local file by line count (e.g. a split
>>> every 100 lines) in a LocalEnvironment to speed up a simple map function?
>>> For me it is not very clear how the local files (files in a directory if
>>> recursive=true) are managed by Flink. Is there any reference to these
>>> internals?
>>>
>>> Best,
>>> Flavio
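
To make the point about line counts concrete, here is a minimal sketch in plain Java. It is not Flink's FileInputFormat code. The class ByteRangeSplits, the Split type and the maxSplitSize parameter are hypothetical names made up for this illustration, and it assumes that a split is simply a byte range of a file. Under that assumption the splits can be computed from the file length alone, whereas a split "every 100 lines" would require reading the file to find the line boundaries first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Flink's API.
public class ByteRangeSplits {

    /** A half-open byte range [start, start + length) of one file. */
    public static final class Split {
        public final long start;
        public final long length;

        public Split(long start, long length) {
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return "Split[start=" + start + ", length=" + length + "]";
        }
    }

    /**
     * Cuts a file of fileLength bytes into ranges of at most maxSplitSize bytes.
     * Only the file length is needed; the file content is never read.
     */
    public static List<Split> createSplits(long fileLength, long maxSplitSize) {
        if (fileLength < 0 || maxSplitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and maxSplitSize > 0");
        }
        List<Split> splits = new ArrayList<>();
        long start = 0;
        while (start < fileLength) {
            // The last range may be shorter than maxSplitSize.
            long length = Math.min(maxSplitSize, fileLength - start);
            splits.add(new Split(start, length));
            start += length;
        }
        return splits;
    }

    public static void main(String[] args) {
        // A 250-byte file with an assumed 100-byte limit per split.
        for (Split split : createSplits(250, 100)) {
            System.out.println(split);
        }
    }
}
```

How the real input formats choose the split size and handle records that cross a range boundary is exactly the part you have to look up in the code.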