Seems like a good idea to collect these questions.

Stack Overflow is also a good place for "useful tricks"...
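
For whoever finds this thread in the archive later, here is a rough sketch of the pattern from the quoted discussion: call the parent's split creation once per file path and merge the results into one array. To keep it runnable without any dependencies, `SinglePathFormat` and `Split` are hypothetical stand-ins that I made up for illustration; they are not Flink classes. In a real job you would extend your own DelimitedInputFormat subclass instead and check the exact method signatures against your Flink version. The renumbering of splits is also my assumption, added as a precaution so that split numbers stay unique across files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MultiPathSplitsSketch {

    /** Hypothetical stand-in for a file split: a path plus a split number. */
    static final class Split {
        final String path;
        final int number;

        Split(String path, int number) {
            this.path = path;
            this.number = number;
        }
    }

    /** Hypothetical stand-in for an input format that reads a single path. */
    static class SinglePathFormat {
        protected String filePath;

        void setFilePath(String filePath) {
            this.filePath = filePath;
        }

        /** For illustration only: pretends each file yields exactly minNumSplits splits. */
        Split[] createInputSplits(int minNumSplits) {
            Split[] splits = new Split[minNumSplits];
            for (int i = 0; i < minNumSplits; i++) {
                splits[i] = new Split(filePath, i);
            }
            return splits;
        }
    }

    /** Runs the parent's split logic once per path and returns one combined array. */
    static class MultiPathFormat extends SinglePathFormat {
        private final List<String> paths;

        MultiPathFormat(List<String> paths) {
            this.paths = new ArrayList<>(paths);
        }

        @Override
        Split[] createInputSplits(int minNumSplits) {
            List<Split> all = new ArrayList<>();
            for (String path : paths) {
                // Point the parent at the current file, then reuse its logic.
                setFilePath(path);
                Split[] perFile = super.createInputSplits(minNumSplits);
                // Renumber so split numbers are unique across all files (assumption).
                for (Split s : perFile) {
                    all.add(new Split(s.path, all.size()));
                }
            }
            return all.toArray(new Split[0]);
        }
    }

    public static void main(String[] args) {
        MultiPathFormat format =
                new MultiPathFormat(Arrays.asList("hdfs://file1", "hdfs://file2"));
        for (Split s : format.createInputSplits(2)) {
            System.out.println(s.number + " -> " + s.path);
        }
    }
}
```

Compared with creating one source per file and taking the union, this keeps a single source in the plan, which is presumably why it was described as the more efficient option.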

On Fri, Jun 26, 2015 at 12:25 PM, Michele Bertoni <
michele1.bert...@mail.polimi.it> wrote:

>  Got it!
> I will try it, thanks! :)
>
>  What about writing a section on it in the programming guide?
> I found a couple of topics about the readers on the mailing list, so it seems
> it may be helpful.
>
>
>
>  On 26 Jun 2015, at 12:21, Stephan Ewen <se...@apache.org>
> wrote:
>
>  Sure, just override the "createInputSplits()" method. Call
> "super.createInputSplits()" for each of your file paths and then combine
> the results into one array that you return.
>
>  That should do it...
>
> On Fri, Jun 26, 2015 at 12:19 PM, Michele Bertoni <
> michele1.bert...@mail.polimi.it> wrote:
>
>> Hi Stephan, thanks for answering.
>> Right now I am using an extension of the DelimitedInputFormat. Is there a
>> way to combine it with option 2?
>>
>>
>>
>>  On 26 Jun 2015, at 12:17, Stephan Ewen <se...@apache.org>
>> wrote:
>>
>>  There are two ways you can realize that:
>>
>>  1) Create multiple sources and union them. This is easy, but probably a
>> bit less efficient.
>>
>>  2) Override the FileInputFormat's createInputSplits method to take a
>> union of the paths and create a list of all files and file splits that
>> will be read.
>>
>>  Stephan
>>
>>
>> On Fri, Jun 26, 2015 at 12:12 PM, Michele Bertoni <
>> michele1.bert...@mail.polimi.it> wrote:
>>
>>> Hi everybody,
>>> is there a way to specify a list of URIs ("hdfs://file1", "hdfs://file2", …)
>>> and open them as different files?
>>> I know I can open the entire directory, but I want to be able to select
>>> a subset of the files in the directory.
>>>
>>> Thanks
>>
>>
>>
>>
>
>
