Thanks Ed. Let me try a few more iterations. Somehow I am not doing this correctly :-) regards sanjay From: Edward Capriolo <[email protected]> To: "[email protected]" <[email protected]>; Sanjay Subramanian <[email protected]> Sent: Sunday, January 25, 2015 8:11 AM Subject: Re: Hive JSON Serde question Nested lists require nested lateral views.
On Sun, Jan 25, 2015 at 11:02 AM, Sanjay Subramanian <[email protected]> wrote: hey guys This is the Hive table definition I have created based on the JSON I am using this version of hive json serde https://github.com/rcongiu/Hive-JSON-Serde ADD JAR /home/sanjay/mycode/jar/jsonserde/json-serde-1.3.1-SNAPSHOT-jar-with-dependencies.jar;DROP TABLE IF EXISTS datafeed_json;CREATE EXTERNAL TABLE IF NOT EXISTS datafeed_json ( object STRING, entry array <struct <id:STRING, time:BIGINT, changes:array <struct <field:STRING, value:struct <item:STRING, verb:STRING, parent_id:STRING, sender_id:BIGINT, created_time:BIGINT>>>>>) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' STORED AS TEXTFILE LOCATION '/data/sanjay/datafeed'; QUERY 1=======ADD JAR /home/sanjay/mycode/jar/jsonserde/json-serde-1.3.1-SNAPSHOT-jar-with-dependencies.jar; SELECT object, entry[0].id, entry[0].time, entry[0].changes[0].field, entry[0].changes[0].value.item, entry[0].changes[0].value.verb, entry[0].changes[0].value.parent_id, entry[0].changes[0].value.sender_id, entry[0].changes[0].value.created_time FROM datafeed_json; RESULT1======foo123 113621765320467 1418608223 leads song1 rock 113621765320467_1107142375968396 100004748082019 1418608223 QUERY2======ADD JAR /home/sanjay/mycode/jar/jsonserde/json-serde-1.3.1-SNAPSHOT-jar-with-dependencies.jar; SELECT object, entry.id, entry.time, ntry FROM datafeed_json LATERAL VIEW EXPLODE (datafeed_json.entry.changes) oc1 AS ntry; RESULT2=======This gives 4 rows but I was not able to iteratively do the LATERAL VIEW EXPLODE I tried various combinations of LATERAL VIEW , LATERAL VIEW EXPLODE, json_tuple to extract all fields in an exploded view from the JSON in tab separated format but no luck. Any thoughts ? Thanks sanjay
