Did you try the standard JsonLoader? I haven't used it personally, but it
looks like you can specify a schema to extract/parse fields from your JSON.

http://pig.apache.org/docs/r0.13.0/func.html#jsonloadstore

If that doesn't work, you can also look at the following example I found by googling:

https://gist.github.com/kimsterv/601331
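
If the JSON only carries job parameters, another option might be to parse it
outside Pig and pass the values in with -param. Below is a rough sketch in
Python, not something I have run against a cluster. The function name
pig_args_from_json is made up for illustration, and the mapping of "source"
to input and "destination" to output is my assumption based on the
Sampling.pig command line quoted below. The delimiter is not in the sample
JSON, so it is defaulted to "," here.

```python
import json
import shlex


def pig_args_from_json(config_text, script="Sampling.pig", delimiter=","):
    """Build a pig argument list from a JSON job description (hypothetical helper)."""
    cfg = json.loads(config_text)
    # Assumed mapping: JSON "source" -> $input, "destination" -> $output.
    params = {
        "input": cfg["source"],
        "output": cfg["destination"],
        "delimiter": delimiter,  # not present in the sample JSON
        "fraction": cfg["elementInfo"]["fraction"],  # passed through unchanged
    }
    args = ["pig", "-x", "mapreduce", "-f", script]
    for name, value in params.items():
        args += ["-param", f"{name}={value}"]
    return args


sample = (
    '{"Name":"sampling","elementInfo":{"fraction":"3"},'
    '"destination":"/user/sree/OUT","source":"/user/sree/foo.txt"}'
)
command = pig_args_from_json(sample)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in command))
```

The list can be handed to subprocess.run(command, check=True) to launch the
job. Note that the fraction is passed through exactly as it appears in the
JSON, so whether "3" needs converting before it reaches SAMPLE depends on what
that field is meant to represent.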


Thanks.




On Fri, Jul 25, 2014 at 8:01 AM, praveenesh kumar <[email protected]>
wrote:

> One simple way is to write a UDF that acts as a JSON parser. Load your
> data and then call your UDF to parse and extract whatever you want from the
> JSON. You need to build what you want to get. Pig doesn't do that for you;
> it gives you the capability to do it. How you do it is up to you.
>
>
> On Fri, Jul 25, 2014 at 12:09 PM, unmesha sreeveni <[email protected]>
> wrote:
>
> > Hi
> >
> > This is my code for sampling
> >
> > --Sampling.pig
> > --pig -x mapreduce -f Sampling.pig -param input=foo.csv -param output=OUT/pig -param delimiter="," -param fraction='0.05'
> >
> > --Load data
> > inputdata = LOAD '$input' using PigStorage('$delimiter');
> >
> > --Group data
> > groupedByAll = group inputdata all;
> >
> > --output into hdfs
> > sampled = SAMPLE inputdata $fraction;
> > store sampled into '$output' using PigStorage('$delimiter');
> >
> > I pass the input parameters on the command line:
> > pig -x mapreduce -f Sampling.pig -param input=foo.csv -param output=OUT/pig -param delimiter="," -param fraction='0.05'
> >
> > I would like to modify this so that the input is taken from JSON.
> >
> > sample json:
> >
> >
> > {"Name":"sampling","elementInfo":{"fraction":"3"},"destination":"/user/sree/OUT","source":"/user/sree/foo.txt"}
> >
> > Now I need to parse the above JSON and extract the required parameters.
> > How can I do that?
> > I know we can load JSON in Apache Pig, but how do I extract the required
> > fields from it?
> >
> > From this I only need
> > fraction, destination, source
> >
> > Please suggest a way
> >
> > --
> > *Thanks & Regards *
> >
> >
> > *Unmesha Sreeveni U.B*
> > *Hadoop, Bigdata Developer*
> > *Center for Cyber Security | Amrita Vishwa Vidyapeetham*
> > http://www.unmeshasreeveni.blogspot.in/
> >
>
