Hi,

We have started working with Avro in CDH4 and querying the Avro files with Hive. This works fine for us, except for unions: we do not understand how to query the data inside a union using Hive.
For example, consider the following schema:

    {
      "type": "record",
      "name": "event",
      "namespace": "com.mysite",
      "fields": [
        {
          "name": "eventbody",
          "type": {
            "type": "record",
            "name": "eventbody",
            "fields": [
              {
                "name": "body",
                "type": [
                  "null",
                  {
                    "type": "record",
                    "name": "event1",
                    "fields": [
                      { "name": "event1Header", "type": ["null", { "type": "array", "items": "string" }], "default": null },
                      { "name": "event1Body", "type": ["null", { "type": "array", "items": "string" }], "default": null }
                    ]
                  },
                  {
                    "type": "record",
                    "name": "event2",
                    "fields": [
                      {
                        "name": "page",
                        "type": {
                          "type": "record",
                          "name": "URL",
                          "fields": [{ "name": "url", "type": "string" }]
                        },
                        "default": null
                      },
                      { "name": "referrer", "type": "string", "default": null }
                    ]
                  }
                ],
                "default": null
              }
            ]
          },
          "default": null
        }
      ]
    }

Note that "body" is a union of three types: null, "event1", and "event2".

If I run a query such as:

    SELECT eventbody.body FROM SRC;

I get a line like this:

    {2:{"page":{"url":"http://www.musite.com/index.jsp"},"referrer":{"url":" www.search.com"}}}

The number "2" at the beginning of the JSON structure is the union tag. It identifies "event2", which is the third element of the union (the tags are counted from zero).

My question: if I want to query fields inside event2, e.g. page.url or referrer, how do I construct the SELECT statement?

Thank you,
Ran
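P.S. To make the tag numbering concrete, here is a minimal Python sketch of how I read that output outside of Hive. It assumes the printed form is always `{tag:json}` as in the sample line above. The names `BODY_BRANCHES` and `parse_union_text` are made up for illustration and are not part of Hive or Avro. This is only client-side post-processing, not the Hive query I am looking for.

```python
import json
import re

# Branches of the "body" union, in schema order (assumed from the schema above).
BODY_BRANCHES = ["null", "event1", "event2"]

# Matches the printed form "{<tag>:<json value>}".
UNION_TEXT = re.compile(r"^\{(\d+):(.*)\}$", re.DOTALL)


def parse_union_text(text):
    """Split a printed union value into (zero-based tag, decoded value)."""
    match = UNION_TEXT.match(text.strip())
    if match is None:
        raise ValueError("not in the assumed {tag:value} form: %r" % text)
    tag = int(match.group(1))
    value = json.loads(match.group(2))
    return tag, value


if __name__ == "__main__":
    line = ('{2:{"page":{"url":"http://www.musite.com/index.jsp"},'
            '"referrer":{"url":" www.search.com"}}}')
    tag, value = parse_union_text(line)
    # Tag 2 selects the third branch of the union.
    print(BODY_BRANCHES[tag])
    print(value["page"]["url"])
```

What I would like is the equivalent of `value["page"]["url"]` expressed directly in a Hive SELECT.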