Re: Hive select shows null after successful data load

Stephen Sprague Tue, 18 Jun 2013 17:40:20 -0700

As Nitin alluded to, it's best to confirm the data is definitely in hdfs
using hdfs commands rather than hive as the first step.


1. how big is it?
   hadoop fs -ls <your hdfs dir>
2. cat a bit of it and see if anything is there.
   hadoop fs -text <your hdfs dir>/<filename> | head -10

do you see any data from step #2?
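
If it helps, the two checks above can be wrapped in a couple of small shell functions. This is only a sketch: the function names hdfs_list and hdfs_peek and the HADOOP_CMD variable are made up for illustration, and it assumes the hadoop client is on your PATH.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the two checks; names are illustrative only.
# HADOOP_CMD is an assumed override in case the client binary lives elsewhere.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Step 1: list the directory to see which files exist and how big they are.
hdfs_list() {
    "$HADOOP_CMD" fs -ls "$1"
}

# Step 2: print the first N lines (default 10) of one file.
hdfs_peek() {
    "$HADOOP_CMD" fs -text "$1" | head -n "${2:-10}"
}
```

For the table in this thread that would be something like hdfs_list /user/sunita/tables/jobs followed by hdfs_peek /user/sunita/tables/jobs/<filename>.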




On Tue, Jun 18, 2013 at 3:58 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:

> I ran some complex queries, something to the effect of
>                     select jobs from jobs;
> which triggers map reduce jobs but does not show errors and produces the
> same output "null". If I try referencing the struct elements, I get an error,
> which seems to be the root cause.
>
> Attached are the select statement outputs with the corresponding hive logs.
>
> I have also attached my usage details of another table - try_parsed which
> has a subset of the same data which seems to work fine. Also attached is
> the input file for this table - try_parsed.json
> Thanks for your help
>
> Sunita
>
>
> On Tue, Jun 18, 2013 at 4:35 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> can you run a slightly more complex query?
>>
>> select uniq values across columns or do some maths, so we know when it
>> fires up a mapreduce job
>>
>>
>> On Wed, Jun 19, 2013 at 1:59 AM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>
>>> Thanks for responding, Nitin. Yes, I am sure that the serde is working fine
>>> and the json file is being picked up, based on all the errors that showed
>>> up until this stage. What sort of error are you suspecting? File not
>>> present, or serde not parsing it?
>>>
>>>
>>> On Tuesday, June 18, 2013, Nitin Pawar wrote:
>>>
>>>> select * from table is as good as hdfs -cat
>>>>
>>>> are you sure there is any data in the table?
>>>>
>>>>
>>>> On Tue, Jun 18, 2013 at 11:54 PM, Sunita Arvind
>>>> <sunitarv...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am able to parse the input JSON file and load it into hive. I do not
>>>>> see any errors with create table, so I am assuming the load succeeded.
>>>>> But when I try to read the data, I get null
>>>>>
>>>>> hive> select * from jobs;
>>>>> OK
>>>>> null
>>>>>
>>>>> I have validated the JSON with JSONLint and Notepad++ JSON plugin and
>>>>> it is a valid JSON. Here is my create table statement and attached is
>>>>> the json input file.
>>>>>
>>>>> create external table jobs (
>>>>> jobs STRUCT<
>>>>> values : ARRAY<STRUCT<
>>>>> company : STRUCT<
>>>>> id : STRING,
>>>>> name : STRING>,
>>>>> postingDate : STRUCT<
>>>>> year : INT,
>>>>> day : INT,
>>>>> month : INT>,
>>>>> descriptionSnippet : STRING,
>>>>> expirationDate : STRUCT<
>>>>> year : INT,
>>>>> day : INT,
>>>>> month : INT>,
>>>>> position : STRUCT<
>>>>> title : STRING,
>>>>> jobFunctions : ARRAY<STRUCT<
>>>>> code : STRING,
>>>>> name : STRING>>,
>>>>> industries : ARRAY<STRUCT<
>>>>> code : STRING,
>>>>> id : STRING,
>>>>> name : STRING>>,
>>>>> jobType : STRUCT<
>>>>> code : STRING,
>>>>> name : STRING>,
>>>>> experienceLevel : STRUCT<
>>>>> code : STRING,
>>>>> name : STRING>>,
>>>>> id : STRING,
>>>>> customerJobCode : STRING,
>>>>> skillsAndExperience : STRING,
>>>>> salary : STRING,
>>>>> jobPoster : STRUCT<
>>>>> id : STRING,
>>>>> firstName : STRING,
>>>>> lastName : STRING,
>>>>> headline : STRING>,
>>>>> referralBonus : STRING,
>>>>> locationDescription : STRING>>>
>>>>>  )
>>>>> ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
>>>>> LOCATION '/user/sunita/tables/jobs';
>>>>>
>>>>> The table creation works fine, but when I attempt to query, I get null
>>>>> as the result.
>>>>> I tried adding Input/Output formats, Serde Properties, nothing seems
>>>>> to impact.
>>>>>
>>>>> I am of the opinion that the libraries cannot handle this level of
>>>>> nesting and I probably will have to write a custom serde or a parser
>>>>> myself. Just wanted to seek guidance before I get into that. Appreciate
>>>>> your help and guidance.
>>>>>
>>>>> regards
>>>>> Sunita
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Nitin Pawar
>>>>
>>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>
