And what is the split policy for the FileInputFormat? Does it depend on the
fs block size?
Is there a pointer to the various Flink input formats and a description of
their internals?

On Wed, Oct 7, 2015 at 3:09 PM, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi Flavio,
>
> it is not possible to split by line count because that would mean reading
> and parsing the whole file just to compute the splits.
>
> Parallel processing of data sources depends on the input splits created by
> the InputFormat. Local files can be split just like files in HDFS. Usually,
> each file corresponds to at least one split but multiple files could also
> be put into a single split if necessary. The logic for that would go into
> the InputFormat.createInputSplits() method.
>
> Cheers, Fabian
>
> 2015-10-07 14:47 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>
>> Hi to all,
>>
>> is there a way to split a single local file by line count (e.g. a split
>> every 100 lines) in a LocalEnvironment to speed up a simple map function?
>> It is not very clear to me how local files (the files in a directory if
>> recursive=true) are managed by Flink. Is there any reference on these
>> internals?
>>
>> Best,
>> Flavio
>>
>
>
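
To illustrate the point made in the thread about byte-based splits versus
line-count splits, here is a minimal sketch in plain Python. It is not Flink
code: the function names, the parameters (block_size, min_num_splits,
min_split_size) and the sizing rule are assumptions made for illustration, and
the real createInputSplits() logic may differ in its details. The first
function needs only the file length, while the second has to read every byte
to learn where each 100th line starts.

```python
import io


def byte_range_splits(file_size, block_size, min_num_splits=1, min_split_size=1):
    """Return (start, length) byte ranges covering a file of file_size bytes.

    Hypothetical sizing rule: cap the split size by the block size and by the
    size needed to reach min_num_splits, but never go below min_split_size.
    Only the file length is needed; the file contents are never read.
    """
    if file_size == 0:
        return []
    # Ceiling division: the largest split size that still yields min_num_splits.
    max_split_size = -(-file_size // max(1, min_num_splits))
    split_size = max(min_split_size, min(max_split_size, block_size))
    splits = []
    start = 0
    while start < file_size:
        length = min(split_size, file_size - start)
        splits.append((start, length))
        start += length
    return splits


def line_count_split_offsets(stream, lines_per_split):
    """Return the byte offsets at which each group of lines_per_split lines starts.

    stream is a binary file-like object. Every line has to be read to find
    the offsets, which is the cost of splitting by line count.
    """
    offsets = [0]
    position = 0
    for line_number, line in enumerate(stream, start=1):
        position += len(line)
        if line_number % lines_per_split == 0:
            offsets.append(position)
    # Drop a trailing offset that points exactly at the end of the data.
    if len(offsets) > 1 and offsets[-1] == position:
        offsets.pop()
    return offsets


if __name__ == "__main__":
    # 250 bytes with a 100-byte block size: three ranges, no file access needed.
    print(byte_range_splits(250, 100))
    # 250 two-byte lines, one split every 100 lines: requires a full scan.
    print(line_count_split_offsets(io.BytesIO(b"x\n" * 250), 100))
```

Byte ranges will usually cut through the middle of a line, so a reader that
works on such splits has to deal with records that straddle a boundary, for
example by starting at the first line break after the beginning of its range.
That handling is not shown in the sketch.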
