Re: Hive select shows null after successful data load

2013-06-21 Thread Sunita Arvind
Yes, values is the outermost array. Probably array<struct<struct is the maximum level of nesting possible. Any number of structs can be nested, but internal arrays seem to be an issue. The ones that failed had array<struct<struct<array<struct. This broke the serde. Regarding pretty printin
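Editor's note: the pattern described above (an array somewhere beneath another array) can be located in a JSON file before any DDL is written. The following is a minimal Python sketch, not part of the original thread. The function name `nested_array_paths` and the sample record are hypothetical; only the field names `jobs`, `values`, `id`, `position`, `title` and `industries` come from the messages, and the `code` field is invented for illustration.

```python
import json


def nested_array_paths(node, path="$", inside_array=False):
    """Yield the path of every array that sits beneath another array."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from nested_array_paths(value, f"{path}.{key}", inside_array)
    elif isinstance(node, list):
        if inside_array:
            # This array is nested under an outer array: the shape the
            # thread reports as breaking the serde.
            yield path
        for item in node:
            yield from nested_array_paths(item, path + "[]", True)


# Hypothetical sample record, shaped like the structure discussed in the thread.
sample = json.loads("""
{"jobs": {"values": [
  {"id": 1, "position": {"title": "Engineer", "industries": [{"code": "4"}]}}
]}}
""")

# Deduplicate, since every element of an outer array reports the same path.
found = sorted(set(nested_array_paths(sample)))
print(found)
```

Any path it reports is a candidate for the workaround mentioned in the thread, such as declaring that column as a string instead of an array.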

Re: Hive select shows null after successful data load

2013-06-20 Thread Stephen Sprague
Hooray! Over one hurdle and onto the next one. So something about that one nested array caused the problem. Very strange. I wonder if there is a smaller test case to look at, as it seems not all arrays break it, since I see one for the attribute "values". As to the formatting issue i don't beli

Re: Hive select shows null after successful data load

2013-06-19 Thread Sunita Arvind
Finally I could get it to work. The issue resolves once I remove the arrays within the position structure. So that is the limitation of the serde. I changed 'industries' to string and 'jobfunctions' to Map. I can query the table just fine now. Here is the complete DDL for reference: create external table

Re: Hive select shows null after successful data load

2013-06-19 Thread Sunita Arvind
Thanks Stephen, Let me explore options. I will let you all know once I am successful. regards Sunita On Wed, Jun 19, 2013 at 3:08 PM, Stephen Sprague wrote: > try_parsed_json is not trivial imho :) > > start with the very, very basic, for example, { "jobs" : "foo" }. Get > that to work first

Re: Hive select shows null after successful data load

2013-06-19 Thread Stephen Sprague
try_parsed_json is not trivial imho :) start with the very, very basic, for example, { "jobs" : "foo" }. Get that to work first. :) When that works add a level of nesting and see what happens. Keep building on it until you either break it (and then you know that last thing you added broke it
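Editor's note: the build-up approach suggested above can be scripted so that each test file differs from the previous one by a single level of nesting. This is a minimal Python sketch, not part of the original thread. The record contents are invented for illustration (only `{ "jobs" : "foo" }` and the field names `values`, `id`, `position`, `title` and `industries` appear in the messages), and the `stepN.json` labels are hypothetical. It assumes the serde reads one JSON object per line, so each record is serialized without line breaks.

```python
import json

# Each step adds one level of nesting to the previous one.
STEPS = [
    {"jobs": "foo"},                                                  # string
    {"jobs": {"values": "foo"}},                                      # struct
    {"jobs": {"values": [{"id": 1}]}},                                # struct<array<struct>>
    {"jobs": {"values": [{"id": 1, "position": {"title": "foo"}}]}},  # adds an inner struct
    {"jobs": {"values": [{"id": 1, "position": {                      # adds an inner array
        "title": "foo", "industries": [{"code": "4"}]}}]}},
]


def to_lines(steps):
    """Serialize each step as a compact, single-line JSON record."""
    return [json.dumps(step, separators=(",", ":")) for step in steps]


lines = to_lines(STEPS)
for number, line in enumerate(lines, start=1):
    print(f"step{number}.json: {line}")
```

Each line would go into its own file and its own table with a matching DDL. The first step whose query returns NULL identifies the construct that breaks the serde.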

Re: Hive select shows null after successful data load

2013-06-19 Thread Sunita Arvind
Thanks for looking into it Ramki. Yes I had tried these options. Here is what I get (renamed the table to have a meaningful name): hive> select jobs.values[1].id from linkedinjobsearch; ..mapreduce task details OK NULL Time taken: 9.586 seconds hive> select jobs.values[0].position.title

Re: Hive select shows null after successful data load

2013-06-19 Thread Ramki Palle
Can you run some other queries on the jobs1 table and see if any query returns some data? I am guessing your query "select jobs.values.position.title from jobs1;" may have some issue. Maybe it should be select jobs.values[0].position.title from jobs1; Regards, Ramki. On Wed, Jun 19, 2013 at

Re: Hive select shows null after successful data load

2013-06-19 Thread Sunita Arvind
Thanks Stephen, That's just what I tried with the try_parsed table. It is exactly the same data with less nesting in the structure and fewer entries. Do you mean to say that highly nested jsons can lead to issues? What are typical solutions to such issues? Write UDFs in hive or parse the J

Re: Hive select shows null after successful data load

2013-06-19 Thread Stephen Sprague
I think you might have to start small here instead of going for the home run on the first swing. When all else fails, start with a trivial json object and then build up from there and see what additional step breaks it. That way you know if the trivial example fails it is something fundamental and n

Re: Hive select shows null after successful data load

2013-06-19 Thread Sunita Arvind
Thanks for sharing your experience Richa. I do have timestamps but in the format of year : INT, day : INT, month : INT. As per your suggestion, I changed them all to string, but still get null as the output. regards Sunita On Wed, Jun 19, 2013 at 2:17 AM, Richa Sharma wrote: > Do you have any t

Re: Hive select shows null after successful data load

2013-06-18 Thread Richa Sharma
Do you have any timestamp fields in the table that might contain null values? I faced a similar situation some time back - changing the data type to string made it work. But I was working on delimited text files. Not sure if it applies to JSON .. but it's still worth giving a try !! Richa On We

Re: Hive select shows null after successful data load

2013-06-18 Thread Sunita Arvind
Having a column name the same as the table name is a problem, due to which I was not able to reference jobs.values.id from jobs. Changing the table name to jobs1 resolved the semantic error. However, the query still returns null hive> select jobs.values.position.title from jobs1; Total MapReduce j

Re: Hive select shows null after successful data load

2013-06-18 Thread Sunita Arvind
Ok. The data files are quite small. Around 35 KB and 1 KB each. [sunita@node01 tables]$ hadoop fs -ls /user/sunita/tables/jobs Found 1 items -rw-r--r-- 3 sunita hdfs 35172 2013-06-18 18:31 /user/sunita/tables/jobs/jobs_noSite_parsed.json [sunita@node01 tables]$ hadoop fs -text /user/suni

Re: Hive select shows null after successful data load

2013-06-18 Thread Stephen Sprague
As Nitin alluded to, it's best to confirm the data is definitely in hdfs using hdfs semantics rather than hive for the first step. 1. how big is it? hadoop fs -ls 2. cat a bit of it and see if anything is there. hadoop fs -text / | head -10 Do you see any data from step #2? On Tue, Jun 18,

Re: Hive select shows null after successful data load

2013-06-18 Thread Nitin Pawar
can you run a little more complex query select uniq across columns or do some maths. so we know when it fires up a mapreduce On Wed, Jun 19, 2013 at 1:59 AM, Sunita Arvind wrote: > Thanks for responding Nitin. Yes I am sure that serde is working fine and > json file is being picked based on all

Re: Hive select shows null after successful data load

2013-06-18 Thread Sunita Arvind
Thanks for responding Nitin. Yes I am sure that serde is working fine and json file is being picked based on all the errors that showed up till this stage. What sort of error are you suspecting? File not present or serde not parsing it? On Tuesday, June 18, 2013, Nitin Pawar wrote: > select * fr

Re: Hive select shows null after successful data load

2013-06-18 Thread Nitin Pawar
select * from table is as good as hdfs -cat. Are you sure there is any data in the table? On Tue, Jun 18, 2013 at 11:54 PM, Sunita Arvind wrote: > Hi, > > I am able to parse the input JSON file and load it into hive. I do not see > any errors with create table, so I am assuming that. But when I

Hive select shows null after successful data load

2013-06-18 Thread Sunita Arvind
Hi, I am able to parse the input JSON file and load it into hive. I do not see any errors with create table, so I am assuming that. But when I try to read the data, I get null hive> select * from jobs; OK null I have validated the JSON with JSONLint and Notepad++ JSON plugin and it is a valid JS