[ 
https://issues.apache.org/jira/browse/PIG-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3627:
----------------------------

    Attachment: PIG-3627-schemageneration.patch

Thanks Sergey. I am fine with checking the schema in JsonStorage, but we need to
treat a NULL schema as bytearray instead of chararray.

The real problem, though, is that we should not get a NULL schema in the first
place. I have attached a patch that fixes the schema generation.
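
To illustrate the bytearray fallback, here is a minimal self-contained sketch. It is not the attached patch and does not use Pig's API; the class NullSchemaSketch and the method render are hypothetical names, and field types are modeled as plain strings. A field whose type is unknown is rendered as bytearray when the schema string is built, so the result no longer contains a NULL type:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class NullSchemaSketch {

    // Hypothetical helper, not part of Pig: renders field name -> type name
    // pairs as a comma-separated schema string. An unknown type (a null
    // reference or the literal "NULL") is assumed to fall back to "bytearray".
    public static String render(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String type = field.getValue();
            if (type == null || type.equalsIgnoreCase("NULL")) {
                type = "bytearray";
            }
            joiner.add(field.getKey() + ":" + type);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Mirrors the schema from the report: product has no known type.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("product", null);
        fields.put("count", "long");
        System.out.println(render(fields));
    }
}
```

This only covers the workaround on the storage side; the attached patch is aimed at the schema generation itself.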

> Json storage : Doesn't work in cases , where other Store Functions (like 
> PigStorage / AvroStorage) do work. 
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-3627
>                 URL: https://issues.apache.org/jira/browse/PIG-3627
>             Project: Pig
>          Issue Type: Bug
>            Reporter: jay vyas
>         Attachments: PIG-3627-schemageneration.patch, PIG-3627.patch
>
>
> The following query 
> {code:title=Bar.java|borderStyle=solid}
>         pigServer.registerQuery(
>                 "uniqcnt  = foreach transactionsG {"+
>                                "sym = transactions.product ;"+
>                                "dsym = distinct sym  ;"+
>                                "generate flatten(dsym.product) as product, COUNT(dsym) as count ;" +
>                                "};");
> {code} 
> Results in the schema:
> {code} 
>    Schema : {product: NULL,count: long}
> {code}
> This schema is storable using AvroStorage or PigStorage, but it fails when
> stored using JsonStorage:
> {code}
> Failed to parse: <line 1, column 8>  Syntax error, unexpected symbol at or near ','
>       at org.apache.pig.parser.QueryParserDriver.parseSchema(QueryParserDriver.java:94)
>       at org.apache.pig.parser.QueryParserDriver.parseSchema(QueryParserDriver.java:108)
>       at org.apache.pig.impl.util.Utils.parseSchema(Utils.java:208)
>       at org.apache.pig.impl.util.Utils.getSchemaFromString(Utils.java:182)
>       at org.apache.pig.builtin.JsonStorage.prepareToWrite(JsonStorage.java:140)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.<init>(PigOutputFormat.java:125)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:86)
>       at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> {code}
> It appears that JsonStorage is thus less robust than the other storage
> formats. Can we confirm which types of data structures do or do not
> work with JsonStorage?
> So, I suggest:
> 1) Ideally, JsonStorage should support the same data that the other
> storage functions support.
> 2) Failing that, the next best thing: a wiki page of examples that can and
> cannot work with JsonStorage and/or a better error message would be
> sufficient to solve this "bug".



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
