As Nitin alluded to, it's best to confirm the data is definitely in HDFS using HDFS semantics rather than Hive for the first step.

1. how big is it?
   hadoop fs -ls <your hdfs dir>
2. cat a bit of it and see if anything is there.
   hadoop fs -text <your hdfs dir>/<filename> | head -10

do you see any data from step #2?

On Tue, Jun 18, 2013 at 3:58 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:

> I ran some complex queries. Something to the extent of
> select jobs from jobs;
> which triggers map reduce jobs but does not show errors and produces the
> same output "null". If I try referencing the struct elements, I get error
> which seems to be the root cause.
>
> Attached are the select statement outputs with the corresponding hive logs.
>
> I have also attached my usage details of another table - try_parsed which
> has a subset of the same data which seems to work fine. Also attached is
> the input file for this table - try_parsed.json
> Thanks for your help
>
> Sunita
>
>
> On Tue, Jun 18, 2013 at 4:35 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> can you run a little more complex query
>>
>> select uniq across columns or do some maths. so we know when it fires up
>> a mapreduce
>>
>>
>> On Wed, Jun 19, 2013 at 1:59 AM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>
>>> Thanks for responding Nitin. Yes I am sure that serde is working fine
>>> and json file is being picked based on all the errors that showed up till
>>> this stage. What sort of error are you suspecting. File not present or
>>> serde not parsing it ?
>>>
>>>
>>> On Tuesday, June 18, 2013, Nitin Pawar wrote:
>>>
>>>> select * from table is as good as hdfs -cat
>>>>
>>>> are you sure there is any data in the table?
>>>>
>>>>
>>>> On Tue, Jun 18, 2013 at 11:54 PM, Sunita Arvind
>>>> <sunitarv...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am able to parse the input JSON file and load it into hive. I do not
>>>>> see any errors with create table, so I am assuming that. But when I try to
>>>>> read the data, I get null
>>>>>
>>>>> hive> select * from jobs;
>>>>> OK
>>>>> null
>>>>>
>>>>> I have validated the JSON with JSONLint and Notepad++ JSON plugin and
>>>>> it is a valid JSON. Here is my create table statement and attached is
>>>>> the json input file.
>>>>>
>>>>> create external table jobs (
>>>>> jobs STRUCT<
>>>>>   values : ARRAY<STRUCT<
>>>>>     company : STRUCT<
>>>>>       id : STRING,
>>>>>       name : STRING>,
>>>>>     postingDate : STRUCT<
>>>>>       year : INT,
>>>>>       day : INT,
>>>>>       month : INT>,
>>>>>     descriptionSnippet : STRING,
>>>>>     expirationDate : STRUCT<
>>>>>       year : INT,
>>>>>       day : INT,
>>>>>       month : INT>,
>>>>>     position : STRUCT<
>>>>>       title : STRING,
>>>>>       jobFunctions : ARRAY<STRUCT<
>>>>>         code : STRING,
>>>>>         name : STRING>>,
>>>>>       industries : ARRAY<STRUCT<
>>>>>         code : STRING,
>>>>>         id : STRING,
>>>>>         name : STRING>>,
>>>>>       jobType : STRUCT<
>>>>>         code : STRING,
>>>>>         name : STRING>,
>>>>>       experienceLevel : STRUCT<
>>>>>         code : STRING,
>>>>>         name : STRING>>,
>>>>>     id : STRING,
>>>>>     customerJobCode : STRING,
>>>>>     skillsAndExperience : STRING,
>>>>>     salary : STRING,
>>>>>     jobPoster : STRUCT<
>>>>>       id : STRING,
>>>>>       firstName : STRING,
>>>>>       lastName : STRING,
>>>>>       headline : STRING>,
>>>>>     referralBonus : STRING,
>>>>>     locationDescription : STRING>>>
>>>>> )
>>>>> ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
>>>>> LOCATION '/user/sunita/tables/jobs';
>>>>>
>>>>> The table creation works fine, but when I attempt to query, I get null
>>>>> as the result.
>>>>> I tried adding Input/Output formats, Serde Properties, nothing seems
>>>>> to impact.
>>>>>
>>>>> I am of the opinion that the libraries cannot handle this level of
>>>>> nesting and I probably will have to write a custom serde or a parser
>>>>> myself. Just wanted to seek guidance before I get into that. Appreciate
>>>>> your help and guidance.
>>>>>
>>>>> regards
>>>>> Sunita
>>>>>
>>>>
>>>> --
>>>> Nitin Pawar
>>>>
>>>
>>
>> --
>> Nitin Pawar
>>
>
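
The two HDFS checks at the top of this reply could be wrapped in a small shell function so they are easy to rerun after each change. This is only a sketch: `check_hdfs_data` is a made-up helper name, and both arguments are placeholders for your own directory and file name. It uses only the `hadoop fs -ls` and `hadoop fs -text` commands already mentioned.

```shell
#!/bin/sh
# Sketch only: check_hdfs_data is a hypothetical helper name, and both
# arguments are placeholders for your own HDFS directory and file name.
check_hdfs_data() {
    dir="$1"
    file="$2"

    # Step 1: list the directory to see whether the file exists and how big it is.
    hadoop fs -ls "$dir"

    # Step 2: print the first 10 lines of the file as text.
    hadoop fs -text "$dir/$file" | head -10
}

# Example call, using the LOCATION from the quoted CREATE TABLE statement;
# the file name is a placeholder you would need to fill in:
# check_hdfs_data /user/sunita/tables/jobs <filename>
```

If step 1 shows no files or a zero-byte file, or step 2 prints nothing, the problem is with the data in HDFS rather than with the SerDe or the table definition.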