As Nitin alluded to, it's best to first confirm the data is definitely in
HDFS, using HDFS commands rather than Hive.

1. How big is it?
   hadoop fs -ls <your hdfs dir>
2. Cat a bit of it and see if anything is there.
   hadoop fs -text <your hdfs dir>/<filename> | head -10

Do you see any data from step #2?
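
If step #2 does print data, one more thing worth checking on that sample is
whether each line is a complete JSON object by itself. This is only a sketch
and rests on an assumption I have not verified for your setup: that the table
reads the file with Hive's default text input format, so the SerDe is handed
one line at a time. A pretty-printed file that JSONLint accepts as a whole
would then not parse line by line. The function name check_json_lines and the
file name sample.txt are made up for illustration.

```python
import json
import sys


def check_json_lines(lines):
    """Return (line_number, ok, detail) for each non-blank line.

    ok is True only when the line parses on its own as a JSON object.
    """
    results = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            # Blank lines carry no record, so skip them.
            continue
        try:
            value = json.loads(text)
        except ValueError as err:
            # The line is not valid JSON when taken in isolation.
            results.append((number, False, str(err)))
            continue
        if isinstance(value, dict):
            keys = ", ".join(sorted(value))
            results.append((number, True, "object with keys: " + keys))
        else:
            results.append((number, False, "valid JSON but not an object"))
    return results


# Hypothetical usage: python check_json_lines.py sample.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for number, ok, detail in check_json_lines(handle):
            status = "OK" if ok else "NOT a record"
            print("line %d: %s (%s)" % (number, status, detail))
```

To try it, save the output of step #2 to a local file first, for example
hadoop fs -text <your hdfs dir>/<filename> | head -10 > sample.txt
and then run the script on sample.txt. If most lines come back as
"NOT a record", the file layout is a more likely suspect than the depth of
nesting.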




On Tue, Jun 18, 2013 at 3:58 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:

> I ran some complex queries. Something to the effect of
>                     select jobs from jobs;
>  which triggers map reduce jobs but does not show errors and produces the
> same output "null". If I try referencing the struct elements, I get an error,
> which seems to be the root cause.
>
> Attached are the select statement outputs with the corresponding hive logs.
>
> I have also attached my usage details of another table - try_parsed which
> has a subset of the same data which seems to work fine. Also attached is
> the input file for this table - try_parsed.json
> Thanks for your help
>
> Sunita
>
>
> On Tue, Jun 18, 2013 at 4:35 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> Can you run a slightly more complex query?
>>
>> Select unique values across columns or do some maths, so we know when it
>> fires up a mapreduce job.
>>
>>
>> On Wed, Jun 19, 2013 at 1:59 AM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>
>>> Thanks for responding, Nitin. Yes, I am sure that the serde is working
>>> fine and the json file is being picked up, based on all the errors that
>>> showed up till this stage. What sort of error are you suspecting? File not
>>> present, or serde not parsing it?
>>>
>>>
>>> On Tuesday, June 18, 2013, Nitin Pawar wrote:
>>>
>>>> select * from table is as good as hdfs -cat
>>>>
>>>> are you sure there is any data in the table?
>>>>
>>>>
>>>> On Tue, Jun 18, 2013 at 11:54 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I believe I am able to parse the input JSON file and load it into hive:
>>>>> I do not see any errors with create table, so I am assuming it worked.
>>>>> But when I try to read the data, I get null
>>>>>
>>>>> hive> select * from jobs;
>>>>> OK
>>>>> null
>>>>>
>>>>> I have validated the JSON with JSONLint and Notepad++ JSON plugin and
>>>>> it is a valid JSON. Here is my create table statement and attached is
>>>>> the json input file.
>>>>>
>>>>> create external table jobs (
>>>>> jobs STRUCT<
>>>>> values : ARRAY<STRUCT<
>>>>> company : STRUCT<
>>>>> id : STRING,
>>>>> name : STRING>,
>>>>> postingDate : STRUCT<
>>>>> year : INT,
>>>>> day : INT,
>>>>> month : INT>,
>>>>> descriptionSnippet : STRING,
>>>>> expirationDate : STRUCT<
>>>>> year : INT,
>>>>> day : INT,
>>>>> month : INT>,
>>>>> position : STRUCT<
>>>>> title : STRING,
>>>>> jobFunctions : ARRAY<STRUCT<
>>>>> code : STRING,
>>>>> name : STRING>>,
>>>>> industries : ARRAY<STRUCT<
>>>>> code : STRING,
>>>>> id : STRING,
>>>>> name : STRING>>,
>>>>> jobType : STRUCT<
>>>>> code : STRING,
>>>>> name : STRING>,
>>>>> experienceLevel : STRUCT<
>>>>> code : STRING,
>>>>> name : STRING>>,
>>>>> id : STRING,
>>>>> customerJobCode : STRING,
>>>>> skillsAndExperience : STRING,
>>>>> salary : STRING,
>>>>> jobPoster : STRUCT<
>>>>> id : STRING,
>>>>> firstName : STRING,
>>>>> lastName : STRING,
>>>>> headline : STRING>,
>>>>> referralBonus : STRING,
>>>>> locationDescription : STRING>>>
>>>>>  )
>>>>> ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
>>>>> LOCATION '/user/sunita/tables/jobs';
>>>>>
>>>>> The table creation works fine, but when I attempt to query, I get null
>>>>> as the result.
>>>>> I tried adding Input/Output formats and Serde Properties; nothing seems
>>>>> to have any impact.
>>>>>
>>>>> I am of the opinion that the libraries cannot handle this level of
>>>>> nesting and I probably will have to write a custom serde or a parser
>>>>> myself. Just wanted to seek guidance before I get into that. Appreciate
>>>>> your help and guidance.
>>>>>
>>>>> regards
>>>>> Sunita
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Nitin Pawar
>>>>
>>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>
