Re: Using json_tuple for Nested json Arrays

2015-10-27 Thread Sam Joe
Aggarwal wrote: > Hello Sam, > You can easily achieve this by using elephant-bird.jars in pig. We are > also capturing tweets via flume and filtering them using pig and elephant-jars. > You can find the related jars on the internet. > > Cheers, > Nishant Aggarwal > On 28 Oct 2015

Re: Using json_tuple for Nested json Arrays

2015-10-27 Thread Sam Joe
working query below > and then work on getting the lateral view explode to work against the temp > table. > > > > FAILED: UDFArgumentException explode() takes an array or a map as a > parameter > > Apparently, hive doesn't think tr3.media is an array or map..s

Re: Using json_tuple for Nested json Arrays

2015-10-27 Thread Sam Joe
e_user_id_str":"16864598","indices":[143,144],"source_status_id_str":"654301626665189376","source_status_id":654301626665189376,"id_str":"654301608994586624"},{"sizes":{"thumb":{"w":150,"resize

Re: Using json_tuple for Nested json Arrays

2015-10-27 Thread Sam Joe
> > SELECT get_json_object(text_col, '$.id') as id FROM tweets_raw limit 10; > > > > You should also be able to use json_tuple(), but start simple > > > > *From:* Sam Joe [mailto:games2013@gmail.com] > *Sent:* Tuesday, October 27, 2015 1:43 PM >

Re: Using json_tuple for Nested json Arrays

2015-10-27 Thread Sam Joe
as in these examples: > > > http://mechanics.flite.com/blog/2014/04/16/using-explode-and-lateral-view-in-hive/ > > > http://stackoverflow.com/questions/28716165/how-to-query-struct-array-with-hive-get-json-object > > > > > > *From:* Sam Joe [mailto:games2013@gmai

Re: Using json_tuple for Nested json Arrays

2015-10-27 Thread Sam Joe
I tried using the EXPLODE function on the nested JSON array, but it doesn't work and throws the following error: FAILED: UDFArgumentException explode() takes an array or a map as a parameter Thanks, Joel On Tue, Oct 27, 2015 at 3:20 PM, Sam Joe wrote: > Hi, > > Is it possible to
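
The error quoted above is consistent with explode() being handed a JSON string rather than a real array, since get_json_object() and json_tuple() return strings. One possible workaround, sketched here in Python as a preprocessing step outside Hive, is to parse each record and emit one row per nested array element. The field path entities.media and the id_str keys are assumptions based on the usual tweet layout, and explode_media is a hypothetical helper name, not something from the thread.

```python
import json


def explode_media(line):
    """Yield (tweet_id, media_id) pairs, one per element of entities.media.

    Assumption: the nested array lives at entities.media and both the tweet
    and each media item carry an "id_str" key. Malformed or unexpected
    records produce no rows instead of raising.
    """
    try:
        tweet = json.loads(line)
    except ValueError:
        return  # malformed JSON: skip the record
    if not isinstance(tweet, dict):
        return
    entities = tweet.get("entities")
    media = entities.get("media") if isinstance(entities, dict) else None
    if not isinstance(media, list):
        return  # no nested array to flatten
    for item in media:
        if isinstance(item, dict):
            yield tweet.get("id_str"), item.get("id_str")


# Hypothetical sample record, not taken from the thread's data.
sample = '{"id_str": "1", "entities": {"media": [{"id_str": "10"}, {"id_str": "11"}]}}'
rows = list(explode_media(sample))
```

Each yielded pair corresponds to one row that a lateral view over a genuine array column would produce; the flattened output could then be loaded into a plain delimited table.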

Using json_tuple for Nested json Arrays

2015-10-27 Thread Sam Joe
Hi, Is it possible to use the json_tuple function to extract data from JSON arrays (including nested ones)? I am trying to process the JSON data as strings and avoid using SerDes, since user data may be malformed. Please see the sample JSON data given below: { "filter_level": "low", "retweeted": false, "in_reply_t

Reading JSON data & org.apache.hadoop.hive.contrib.serde2.JsonSerde

2015-10-23 Thread Sam Joe
Hi, Does *org.apache.hadoop.hive.contrib.serde2.JsonSerde* support reading nested data? Also, could you please point me to a location where I can download the jar for *org.apache.hadoop.hive.contrib.serde2.JsonSerde*? Appreciate your help! Thanks, Joel

Re: Need suggestions on processing JSON junk (e.g., invalid double quotes) data using HIVE

2015-10-22 Thread Sam Joe
aus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521) at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF(JsonParserMinimalBase.java:454) at org.codehaus.jackson.impl.ReaderBasedParser._parseFieldName2(ReaderBasedParser.java:1025)

Need suggestions on processing JSON junk (e.g., invalid double quotes) data using HIVE

2015-10-22 Thread Sam Joe
Hi, After streaming Twitter data to HDFS using Flume, I'm trying to analyze it using some Hive queries. The data is in JSON format but is not clean: it has double quotes (") in the wrong places, which causes the Hive queries to fail. I am getting the following error: Failed with exception java.io.IOException:o
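
One way to approach stray double quotes like those described above is to separate parseable records from malformed ones before the data reaches Hive. The following is only a minimal sketch under that assumption: it treats each line as one JSON record, and split_valid_invalid and the sample records are hypothetical names and data introduced for illustration.

```python
import json


def split_valid_invalid(lines):
    """Partition lines into (valid, invalid) lists of JSON record strings.

    Assumption: one JSON record per line. Blank lines are dropped.
    """
    valid, invalid = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore empty lines
        try:
            json.loads(text)
        except ValueError:
            invalid.append(text)  # e.g. an unescaped double quote inside a string
        else:
            valid.append(text)
    return valid, invalid


# Hypothetical records: the second has unescaped quotes inside a string value.
records = ['{"text": "ok"}', '{"text": "bad "quote" here"}', '']
good, bad = split_valid_invalid(records)
```

Writing the valid records to one location and the rejected ones to another keeps the bad rows available for inspection while letting queries run against clean input.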