If you find you need something smarter than Pig's glob handling, you can use
a shell script to pick the files and pass them in on the command line with
-param myfile=somefile, then put $myfile in your LOAD statement.
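
A minimal sketch of that approach, assuming the files sit on the local
filesystem (for HDFS you would list them with hadoop fs -ls instead of a
shell glob). The function name select_files and the script name
load_subset.pig are made up for illustration, and it relies on LOAD
accepting a comma-separated list of paths, which is worth verifying on your
Pig version:

```shell
#!/bin/sh
# Hypothetical helper: print a comma-separated list of the files in
# directory $1 whose names contain the keyword $2.
# Assumes no file name contains a comma.
select_files() {
    dir=$1
    keyword=$2
    list=""
    for f in "$dir"/*"$keyword"*; do
        # When nothing matches, the glob is left as a literal string; skip it.
        [ -e "$f" ] || continue
        # Append with a comma separator, except before the first entry.
        list="${list:+$list,}$f"
    done
    printf '%s\n' "$list"
}

# Example invocation (assumes pig is on the PATH and load_subset.pig exists):
#   pig -param myfile="$(select_files folder site)" load_subset.pig
```

Inside load_subset.pig the load statement would then read
table = LOAD '$myfile' USING ...; and any selection logic too complicated
for a glob can live in the shell function instead.
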
On Oct 3, 2014 1:58 PM, "hanif mahboobi" <[email protected]>
wrote:

>
>
> Sure,
>
> table = LOAD 'folder/*KeyWord*' USING
> org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE',
> 'NOCHANGE', 'SKIP_INPUT_HEADER') AS (rec: chararray);
>
>
>
> On Friday, October 3, 2014 4:51 PM, Bob Metelsky <[email protected]>
> wrote:
>
>
>
> Can you post what you did/used?
>
>
> On Fri, Oct 3, 2014 at 4:41 PM, hanif mahboobi <
> [email protected]> wrote:
>
> > Hi Praveen,
> >
> > Thanks for the reply.
> > In fact, after some minor debugging, I got it to work, and it can now
> > read the files in using the glob pattern.
> >
> > Best,
> > Hanif
> >
> >
> >
> > On Friday, October 3, 2014 3:08 AM, Praveen R <
> > [email protected]> wrote:
> >
> >
> >
> > Looks like Pig load doesn't support glob patterns.
> >
> > I guess you would need to write a custom loader to achieve this.
> >
> >
> > On Fri, Oct 3, 2014 at 3:25 AM, hanif mahboobi <
> > [email protected]> wrote:
> >
> > > Hi There,
> > >
> > > Here is my problem.
> > > I have a folder with thousands of files in it. I just want to load a
> > > certain subset of them that have a specific string in their names (in
> > > this example, "site").
> > >
> > > Knowing that something like this does not work:
> > > table = LOAD 'folder/*site*' USING ...
> > >
> > > Can anybody help with that?
> > >
> > > Thanks,
> > > Hanif
