We are running EB in production with Pig 0.11 against CDH3. Hadoop 2 is a different story -- lots of things need to change for that to work. Raghu has a branch with the necessary EB changes: https://github.com/rangadi/elephant-bird/tree/hadoop-2.0-support

On Thu, Apr 4, 2013 at 6:39 PM, Ruslan Al-Fakikh <[email protected]> wrote:

> Hi guys,
>
> As for elephant-bird, it seems that it is not compatible with Pig 0.10
> (CDH4) :(
> I am using this configuration:
> pig -version
> Apache Pig version 0.10.0-cdh4.1.1 (rexported)
> hadoop version
> Hadoop 2.0.0-cdh4.1.1
> and getting just the same error as Tim explained:
> java.lang.IncompatibleClassChangeError: Found interface
> org.apache.hadoop.mapreduce.Counter, but class was expected
>
> I am running it with the following commands:
> REGISTER elephant-bird-pig-3.0.2.jar;
> inputData = LOAD 'sample_simple.json' USING
> com.twitter.elephantbird.pig.load.JsonLoader() as (json:map[]);
> DUMP inputData;
>
>
> On Thu, Sep 27, 2012 at 8:48 AM, Dmitriy Ryaboy <[email protected]> wrote:
>
> > Yep. It's just JsonLoader.
> > By default it works on top of whatever's returned by TextInputFormat, but
> > you can override that. As long as the input format returns a string that's
> > valid json, we are cool (so in theory you could write a
> > TwitterAPIInputFormat or something, and get the json in Pig, not that I
> > would recommend that).
> >
> > D
> >
> > On Wed, Sep 26, 2012 at 9:34 PM, Russell Jurney <[email protected]> wrote:
> >
> > > Does that work without lzo?
> > >
> > > Russell Jurney http://datasyndrome.com
> > >
> > > On Sep 26, 2012, at 9:00 PM, Dmitriy Ryaboy <[email protected]> wrote:
> > >
> > > > Try asking Michael May on GitHub? This seems to be an issue with his
> > > > Loader..
> > > >
> > > > The JsonLoader in ElephantBird should work in this case if you turn on
> > > > nested parsing (
> > > > https://github.com/kevinweil/elephant-bird/blob/master/pig/src/main/java/com/twitter/elephantbird/pig/load/JsonLoader.java
> > > > )
> > > >
> > > > D
> > > >
> > > > On Wed, Sep 26, 2012 at 2:31 PM, Deepak Tiwari <[email protected]> wrote:
> > > >
> > > >> My bad..
> > > >> I think I compiled it from
> > > >> https://github.com/mmay/PigJsonLoader/blob/master/JsonLoader.java
> > > >> a long time back in my piggybank area.. it indeed didn't come with the
> > > >> original jar...
> > > >>
> > > >> Regards,
> > > >>
> > > >> Deepak
> > > >>
> > > >> On Tue, Sep 25, 2012 at 8:14 AM, Bill Graham <[email protected]> wrote:
> > > >>
> > > >>> I missed the part about Piggybank, but I'm confused because I don't see
> > > >>> that class in SVN:
> > > >>>
> > > >>> http://svn.apache.org/viewvc/pig/branches/branch-0.10/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/
> > > >>>
> > > >>> Either way, your error seems to be an issue with parsing the doubles.
> > > >>>
> > > >>>
> > > >>> On Mon, Sep 24, 2012 at 2:24 PM, Vivek Shrivastava <[email protected]> wrote:
> > > >>>
> > > >>>> Thanks for responding, Bill. However, I am using the JsonLoader that is
> > > >>>> in the Piggybank with Pig-0.10.0.
> > > >>>>
> > > >>>> It doesn't need any schema and converts JSON data to a map (
> > > >>>> org.apache.pig.piggybank.storage.JsonLoader() as (json:map[]) ), and I
> > > >>>> extract data from there using keys. I have processed a huge amount of
> > > >>>> data without any problem, and no schema was required.
> > > >>>>
> > > >>>> Regards,
> > > >>>>
> > > >>>> Vivek
> > > >>>>
> > > >>>> On Mon, Sep 24, 2012 at 2:03 PM, Bill Graham <[email protected]> wrote:
> > > >>>>
> > > >>>>> This loader only works for data stored using JsonStorage. From the
> > > >>>>> javadocs:
> > > >>>>>
> > > >>>>> A loader for data stored using JsonStorage
> > > >>>>> <http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/JsonStorage.html>.
> > > >>>>>
> > > >>>>> This is not a generic JSON loader. It depends on the schema being
> > > >>>>> stored with the data, when conceivably you could write a loader that
> > > >>>>> determines the schema from the JSON.
> > > >>>>>
> > > >>>>> Was this data produced via JsonStorage? If not, you'll need to write a
> > > >>>>> custom loader.
> > > >>>>>
> > > >>>>> On Mon, Sep 24, 2012 at 12:04 PM, Deepak Tiwari <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> I am trying to parse this data using the Pig parser
> > > >>>>>> org.apache.pig.piggybank.storage.JsonLoader
> > > >>>>>>
> > > >>>>>> {"geo":{"type":"Polygon","coordinates":[[[-91.3061478,-30.2688069],[-91.012471,-60.2688069],[-91.012471,-69.9306357],[-91.3061478,-29.9306357]]]},
> > > >>>>>>
> > > >>>>>> I need to extract this array
> > > >>>>>>
> > > >>>>>> [[[-91.3061478,-30.2688069],[-91.012471,-60.2688069],[-91.012471,-69.9306357],[-91.3061478,-29.9306357]]]
> > > >>>>>>
> > > >>>>>> I am getting this error while accessing flatten(geo#'coordinates').
> > > >>>>>> I think that's the limitation ("only standard Pig type is supported")
> > > >>>>>> of the parser, but I am wondering if someone has a workaround.
> > > >>>>>>
> > > >>>>>> "java.lang.RuntimeException: Unexpected data type
> > > >>>>>> org.codehaus.jackson.node.DoubleNode found in stream. Note only
> > > >>>>>> standard Pig type is supported when you output from UDF/LoadFunc"
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Thanks very much,
> > > >>>>>>
> > > >>>>>> Deepak
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> *Note that I'm no longer using my Yahoo! email address. Please email
> > > >>>>> me at [email protected] going forward.*
> > > >>>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> *Note that I'm no longer using my Yahoo! email address. Please email
> > > >>> me at [email protected] going forward.*
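
For reference, a minimal sketch of the nested-parsing suggestion quoted above. It assumes the `-nestedLoad` option string of ElephantBird's JsonLoader and a Hadoop 1 / CDH3 cluster (per the note at the top of this message, EB does not run on Hadoop 2 as-is). The jar file names other than elephant-bird-pig-3.0.2.jar and the input path `geo.json` are placeholders, not taken from the thread, and the exact set of jars to register may differ by EB version:

```pig
-- Jar names below are placeholders; adjust to what is installed on your cluster.
REGISTER elephant-bird-core-3.0.2.jar;
REGISTER elephant-bird-pig-3.0.2.jar;
REGISTER json-simple.jar;

-- '-nestedLoad' asks JsonLoader to parse nested JSON objects and arrays
-- into nested Pig maps and bags rather than only the top level.
inputData = LOAD 'geo.json'
    USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
    AS (json:map[]);

-- Walk into the nested map to pull out the coordinates array.
coords = FOREACH inputData GENERATE json#'geo'#'coordinates' AS coordinates;
DUMP coords;
```

Whether the chained `#` lookup needs an explicit cast may depend on the Pig version, so treat the FOREACH line as a starting point rather than a tested recipe.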
