[ https://issues.apache.org/jira/browse/HIVE-25188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355908#comment-17355908 ]
David Mollitor (belugabehr) edited comment on HIVE-25188 at 6/2/21, 6:29 PM:
----------------------------------------------------------------

[~dengzh] I've formatted the JSON to make it easier to read for discussion's sake. FYI, a few stray characters at the end of your example were giving me issues during formatting.

{code:json}
{
  "data": {
    "H": {
      "event": "track_active",
      "platform": "Android"
    },
    "B": {
      "device_type": "Phone",
      "uuid": "[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"
    }
  },
  "messageId": "2475185636801962",
  "publish_time": 1622514629783,
  "attributes": {
    "region": "IN"
  }
}
{code}

create table json_table(data string, messageid string, publish_time bigint, attributes string);

The {{data}} field is not a String type; it is itself a struct. If you intend to stuff arbitrary data into that field, then {{data}} should be a Base64-encoded string, which you can then declare as a Binary type in Hive. I think that's preferable to allowing an overloaded String type. If you need to parse or query specific data from it, you would Base64-decode it and use the {{get_json_object}} or {{json_tuple}} UDFs to read it.

> JsonSerDe: Unable to read the string value from a nested json
> -------------------------------------------------------------
>
>                 Key: HIVE-25188
>                 URL: https://issues.apache.org/jira/browse/HIVE-25188
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 4.0.0
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> create table json_table(data string, messageid string, publish_time bigint, attributes string);
>
> If the data of the table is stored like:
> {code:java}
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}}{code}
> An exception is thrown when trying to deserialize the data:
>
> Caused by: java.lang.IllegalArgumentException
>   at com.google.common.base.Preconditions.checkArgument(Preconditions.java:108)
>   at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitLeafNode(HiveJsonReader.java:374)
>   at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitNode(HiveJsonReader.java:216)
>   at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitStructNode(HiveJsonReader.java:327)
>   at org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitNode(HiveJsonReader.java:221)
>   at org.apache.hadoop.hive.serde2.json.HiveJsonReader.parseStruct(HiveJsonReader.java:198)
>   at org.apache.hadoop.hive.serde2.JsonSerDe.deserialize(JsonSerDe.java:181)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
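
To make the type mismatch discussed in the comment concrete, here is a minimal Python sketch, independent of Hive, that parses the record from the issue (without the stray trailing characters) and reports the JSON type of each top-level field. It also sketches the Base64 round trip suggested in the comment. The helper names {{json_kind}}, {{pack_data}}, and {{unpack_data}} are hypothetical and exist only for this illustration; nothing here reflects how JsonSerDe is implemented.

```python
import base64
import json

# Record from the issue, with the stray trailing characters removed.
RAW = (
    '{"data":{"H":{"event":"track_active","platform":"Android"},'
    '"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,'
    '11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},'
    '"messageId":"2475185636801962","publish_time":1622514629783,'
    '"attributes":{"region":"IN"}}'
)


def json_kind(value):
    """Return the JSON type name of a value parsed by the json module."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, str):
        return "string"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "null"


def pack_data(record):
    """Hypothetical producer-side step: ship `data` as a Base64 string."""
    packed = dict(record)
    payload = json.dumps(record["data"], separators=(",", ":"))
    packed["data"] = base64.b64encode(payload.encode("utf-8")).decode("ascii")
    return packed


def unpack_data(packed):
    """Reader-side step: Base64-decode `data`, then parse it as JSON."""
    return json.loads(base64.b64decode(packed["data"]).decode("utf-8"))


record = json.loads(RAW)
kinds = {name: json_kind(value) for name, value in record.items()}
packed = pack_data(record)

print(kinds)
print(json_kind(packed["data"]))
print(unpack_data(packed)["H"]["event"])
```

In this sketch {{data}} parses as a JSON object rather than a string, which is the mismatch with the {{data string}} column in the table definition. After packing, {{data}} is a plain JSON string whose decoded content can still be parsed field by field.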