Re: open multiple file from list of uri

2015-07-17 Thread Stephan Ewen
You are right, the implementation needs a placeholder here. The placeholder can probably be a "fake path", like "file:///this/will/never/be/read/anyways", because you override the "createInputSplits()" method... On Thu, Jul 16, 2015 at 12:03 AM, Michele Bertoni <michele1.bert...@mail.polimi.it> wrote:
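
A minimal sketch of the placeholder idea, in plain Java with hypothetical stand-in classes. SinglePathFormat and MultiPathFormat are not Flink classes, and the configure() check is only modelled on the behaviour described in this thread (an exception when filePath is null). The point is that the subclass sets a dummy path up front so the check passes, while the real paths live in a separate field.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins, NOT Flink classes: they only model the null check
// on filePath that the thread describes.
class PlaceholderSketch {

    // Stand-in for a file input format that insists on a single path.
    static class SinglePathFormat {
        protected String filePath;

        // Assumed behaviour: fail when no path has been set.
        void configure() {
            if (filePath == null) {
                throw new IllegalArgumentException("File path was not specified.");
            }
        }
    }

    // Subclass that keeps the real paths in its own field.
    static class MultiPathFormat extends SinglePathFormat {
        // Dummy value, never opened: it exists only to satisfy configure().
        static final String PLACEHOLDER = "file:///this/will/never/be/read/anyways";

        private final List<String> paths;

        MultiPathFormat(List<String> paths) {
            this.paths = paths;
            this.filePath = PLACEHOLDER;
        }

        // The paths that an overridden split-creation method would actually use.
        List<String> pathsToRead() {
            return paths;
        }
    }

    public static void main(String[] args) {
        MultiPathFormat format =
                new MultiPathFormat(Arrays.asList("hdfs://file1", "hdfs://file2"));
        format.configure(); // passes because the placeholder is set
        System.out.println("configured with " + format.pathsToRead().size() + " real paths");
    }
}
```

In a real job the placeholder would be set through whatever setter the actual input format offers. The constructor assignment here is only an assumption of the sketch.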

Re: open multiple file from list of uri

2015-07-15 Thread Michele Bertoni
Hmm, it doesn't seem to work: it calls the configure() method, which checks whether filePath is null and throws an exception. I actually set that field only in createInputSplits(), which runs some steps later. On 15 Jul 2015, at 13:16, Stephan Ewen <se...@apache.org> wrote:

Re: open multiple file from list of uri

2015-07-15 Thread Stephan Ewen
If you want to work without the placeholder, simply do: "env.createInput(new myDelimitedInputFormat(parser)(paths))". The "createInputSplits()" method looks good. Greetings, Stephan On Tue, Jul 14, 2015 at 11:42 PM, Michele Bertoni <michele1.bert...@mail.polimi.it> wrote: > Ok thank you, now I

Re: open multiple file from list of uri

2015-07-14 Thread Michele Bertoni
OK, thank you, I have solved it now! The problem was in env.readFile(myInputFormat, path): now that path is actually a list of paths, what should I pass to it? I solved it this way: env.readFile(new myDelimitedInputFormat(parser)(paths), paths.head), where paths.head gives readFile a ur

Re: open multiple file from list of uri

2015-07-14 Thread Stephan Ewen
For the approach that I outlined, you need to subclass the file input format. In that subclass, you store the list of URIs (in a new variable) and override the "createInputSplits()" method. Stephan On Tue, Jul 14, 2015 at 6:42 PM, Michele Bertoni <michele1.bert...@mail.polimi.it> wrote: >

Re: open multiple file from list of uri

2015-07-14 Thread Michele Bertoni
Hi Stephan, I started working on this today, but I am having a problem. Can you describe the procedure in a little more detail? I don't understand how to give the input format the list of URIs, since it will try to put it in a Path variable. createInputSplits does not receive the path but ta

Re: open multiple file from list of uri

2015-06-26 Thread Michele Bertoni
Right! Later I will post the question there, quoting your answer as the solution :) On 26 Jun 2015, at 12:27, Stephan Ewen <se...@apache.org> wrote: Seems like a good idea to collect these questions. Stack Overflow is also a good place for "useful tricks"... On Fri, Jun 26

Re: open multiple file from list of uri

2015-06-26 Thread Stephan Ewen
Seems like a good idea to collect these questions. Stack Overflow is also a good place for "useful tricks"... On Fri, Jun 26, 2015 at 12:25 PM, Michele Bertoni <michele1.bert...@mail.polimi.it> wrote: > Got it! > I will try, thanks! :) > > What about writing a section of it in the programming g

Re: open multiple file from list of uri

2015-06-26 Thread Michele Bertoni
Got it! I will try, thanks! :) What about writing a section on it in the programming guide? I found a couple of topics about the readers on the mailing list, so it seems it may be helpful. On 26 Jun 2015, at 12:21, Stephan Ewen <se...@apache.org> wrote: Sure, just override

Re: open multiple file from list of uri

2015-06-26 Thread Stephan Ewen
Sure, just override the "createInputSplits()" method. Call "super.createInputSplits()" for each of your file paths and then combine the results into one array that you return. That should do it... On Fri, Jun 26, 2015 at 12:19 PM, Michele Bertoni <michele1.bert...@mail.polimi.it> wrote: > Hi S
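
The override-and-combine step can be sketched roughly as follows. This is plain Java with hypothetical stand-in classes (Split, SinglePathFormat, and MultiPathFormat are not Flink classes). The split-creation logic of the base class and the renumbering of the combined splits are assumptions made only so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins, NOT Flink classes: they only model the shape of
// "call super.createInputSplits() per path, then combine the results".
class MultiPathSketch {

    // Stand-in for an input split: a file path plus a split number.
    static final class Split {
        final String path;
        final int number;

        Split(String path, int number) {
            this.path = path;
            this.number = number;
        }
    }

    // Stand-in for a file input format that knows about one path only.
    static class SinglePathFormat {
        protected String filePath;

        // Assumed behaviour: create minNumSplits splits for the current filePath.
        Split[] createInputSplits(int minNumSplits) {
            Split[] splits = new Split[minNumSplits];
            for (int i = 0; i < minNumSplits; i++) {
                splits[i] = new Split(filePath, i);
            }
            return splits;
        }
    }

    // Subclass that stores the list of paths in a new field.
    static class MultiPathFormat extends SinglePathFormat {
        private final List<String> paths;

        MultiPathFormat(List<String> paths) {
            this.paths = new ArrayList<>(paths);
        }

        @Override
        Split[] createInputSplits(int minNumSplits) {
            List<Split> all = new ArrayList<>();
            for (String path : paths) {
                this.filePath = path; // point the base class at one file
                for (Split s : super.createInputSplits(minNumSplits)) {
                    // Renumber so split numbers stay unique across files
                    // (an assumption of this sketch).
                    all.add(new Split(s.path, all.size()));
                }
            }
            return all.toArray(new Split[0]);
        }
    }

    public static void main(String[] args) {
        MultiPathFormat format =
                new MultiPathFormat(Arrays.asList("hdfs://file1", "hdfs://file2"));
        for (Split s : format.createInputSplits(2)) {
            System.out.println(s.number + " -> " + s.path);
        }
    }
}
```

The same pattern would apply to a subclass of an existing delimited format: only the split creation changes, while record parsing stays in the parent class.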

Re: open multiple file from list of uri

2015-06-26 Thread Michele Bertoni
Hi Stephan, thanks for answering. Right now I am using an extension of the DelimitedInputFormat; is there a way to combine it with option 2? On 26 Jun 2015, at 12:17, Stephan Ewen <se...@apache.org> wrote: There are two ways you can realize that: 1) Create multiple

Re: open multiple file from list of uri

2015-06-26 Thread Stephan Ewen
There are two ways you can realize that:
1) Create multiple sources and union them. This is easy, but probably a bit less efficient.
2) Override the FileInputFormat's createInputSplits method to take a union of the paths to create a list of all files and file splits that will be read.
Stephan

open multiple file from list of uri

2015-06-26 Thread Michele Bertoni
Hi everybody, is there a way to specify a list of URIs ("hdfs://file1", "hdfs://file2", …) and open them as different files? I know I could open the entire directory, but I want to be able to select a subset of the files in the directory. Thanks