This seems like a bug to me. It makes it risky to work with JSON data generated by something other than Pig since the ordering might change. What do you think?
I didn't see a bug for it in Jira, so would this (still open) one be the place to mention it? Or should I make a new one?
https://issues.apache.org/jira/browse/PIG-1914

~T

On 7 January 2013 20:24, Alan Gates <[email protected]> wrote:
> Currently the JsonLoader does assume ordering of the fields. It does not do
> any name matching against the given schema to find the right field.
>
> Alan.
>
> On Jan 7, 2013, at 11:56 AM, Tim Sell wrote:
>
>> When using JsonLoader with Pig 0.10.0
>>
>> if I have an input.json file that looks like this:
>>
>> {"date": "2007-08-25", "id": 16}
>> {"date": "2007-09-08", "id": 17}
>> {"date": "2007-09-15", "id": 18}
>>
>> And I use
>>
>> a = LOAD 'input.json' USING JsonLoader('id:int,date:chararray');
>> DUMP a;
>>
>> I get errors when it tries to force the date fields into an integer.
>>
>> Shouldn't this work independent of the ordering of the schema fields?
>> Json writers generally don't make guarantees about the ordering.
>>
>> One alternative (though annoying) would be to use elephant bird
>> instead, but I can't get that to compile against hadoop 2.0.0 and Pig
>> 0.10.0.
>>
>> ~Tim
>
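
A possible stopgap while JsonLoader matches fields by position rather than by name: rewrite each record so its keys appear in the same order as the schema before Pig reads the file. The following is only a minimal sketch in Python, not anything Pig provides. It assumes one JSON object per line, as in the input.json above; the names `SCHEMA_FIELDS` and `reorder_record` are made up for illustration.

```python
import json
import sys

# Field order used in the JsonLoader schema 'id:int,date:chararray'
# (taken from the example above; adjust to match your own schema).
SCHEMA_FIELDS = ["id", "date"]


def reorder_record(line, fields=SCHEMA_FIELDS):
    """Re-serialize one JSON object with its keys in schema order.

    Keys missing from the record are written as null, and keys that are
    not listed in `fields` are dropped.
    """
    record = json.loads(line)
    # Python dicts keep insertion order (3.7+), so json.dumps emits the
    # keys in exactly the order they are inserted here.
    ordered = {name: record.get(name) for name in fields}
    return json.dumps(ordered)


if __name__ == "__main__":
    # Read records on stdin, write reordered records to stdout.
    for raw in sys.stdin:
        if raw.strip():
            print(reorder_record(raw))
```

If this were saved as a script (say `reorder.py`, a hypothetical name), it could be run as `python reorder.py < input.json > ordered.json`, and the LOAD statement pointed at ordered.json instead. This only hides the problem on the input side; it does not change how JsonLoader itself behaves.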
