Yes, Bejoy,

it was the data. I also have to be strict with GROUP BY and not select any
fields without an aggregate function unless they are GROUP BY keys (unlike
MySQL).
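
A minimal sketch of the strict form, reusing the table and column names from
the original query quoted further down. The COUNT(*) column and its alias
`bookings` are assumptions added for illustration only; they are not part of
the original query, and the exact query that finally ran is not shown in this
thread.

```sql
-- Accepted by Hive: every selected expression is either a GROUP BY key
-- or wrapped in an aggregate function.
SELECT trans.property_id,
       day(trans.log_timestamp),
       COUNT(*) AS bookings   -- hypothetical aggregate, for illustration only
FROM trans
JOIN opts
  ON trans.ext_booking_id["ext_booking_id"] = opts.ext_booking_id
GROUP BY day(trans.log_timestamp), trans.property_id;

-- Rejected by Hive: trans.log_timestamp is neither a GROUP BY key
-- nor inside an aggregate function.
-- SELECT trans.property_id, trans.log_timestamp
-- FROM trans
-- GROUP BY trans.property_id;
```

MySQL with ONLY_FULL_GROUP_BY disabled would accept the second query and
return an arbitrary log_timestamp per group, which is why the same habit does
not carry over to Hive.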

Thank you,
Mark

On Wed, Oct 19, 2011 at 11:54 AM, <bejoy...@yahoo.com> wrote:

> Looks like a data problem. Were you running the GROUP BY query on the same
> data set?
> But if COUNT(*) also throws an error, then we are back to square one: an
> installation/configuration problem with Hive or MapReduce.
>
> Regards
> Bejoy K S
> ------------------------------
> *From: * Mark Kerzner <mark.kerz...@shmsoft.com>
> *Date: *Wed, 19 Oct 2011 10:55:34 -0500
> *To: *<user@hive.apache.org>; <bejoy...@yahoo.com>
> *ReplyTo: * user@hive.apache.org
> *Subject: *Re: Hive query failing on group by
>
> Bejoy,
>
> I've been using this install of Hive for some time now, and simple queries
> and joins work fine. It's the GROUP BY that I have problems with, sometimes
> even with COUNT(*).
>
> I am trying to isolate the problem now, and reduce it to the smallest query
> possible. I am also trying to find a workaround (I noticed that sometimes
> rephrasing queries for Hive helps), since I need this for a project.
>
> Thank you,
> Mark
>
> On Wed, Oct 19, 2011 at 10:25 AM, <bejoy...@yahoo.com> wrote:
>
>> Mark,
>> To make sure your Hive installation is fine, run two queries:
>> SELECT * FROM trans LIMIT 10;
>> SELECT * FROM trans WHERE ***;
>> Try this on a couple of different tables. If these queries run and return
>> the expected results, your Hive installation is probably working.
>>
>> If so, as a second step, run a simple join between two tables on
>> primitive data type columns. If that also looks good, you can more or less
>> confirm that the bug is in your Hive query.
>>
>> We can look into that direction then.
>>
>>
>>
>> Regards
>> Bejoy K S
>> ------------------------------
>> *From: * Mark Kerzner <mark.kerz...@shmsoft.com>
>> *Date: *Wed, 19 Oct 2011 10:02:57 -0500
>> *To: *<user@hive.apache.org>
>> *ReplyTo: * user@hive.apache.org
>> *Subject: *Re: Hive query failing on group by
>>
>> Vikas,
>>
>> I am using Cloudera CDHU1 on Ubuntu. I get the same results on RedHat
>> CDHU0.
>>
>> Mark
>>
>> On Wed, Oct 19, 2011 at 9:47 AM, Vikas Srivastava <
>> vikas.srivast...@one97.net> wrote:
>>
>>> Install Hive with RPM; this install is corrupted!
>>>
>>> On Wed, Oct 19, 2011 at 8:01 PM, Mark Kerzner 
>>> <mark.kerz...@shmsoft.com> wrote:
>>>
>>>> Here is what my hive logs say
>>>>
>>>> hive -hiveconf hive.root.logger=DEBUG
>>>>
>>>> 2011-10-19 09:24:35,148 ERROR DataNucleus.Plugin
>>>> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
>>>> "org.eclipse.core.resources" but it cannot be resolved.
>>>> 2011-10-19 09:24:35,150 ERROR DataNucleus.Plugin
>>>> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
>>>> "org.eclipse.core.runtime" but it cannot be resolved.
>>>> 2011-10-19 09:24:35,150 ERROR DataNucleus.Plugin
>>>> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
>>>> "org.eclipse.text" but it cannot be resolved.
>>>>
>>>>
>>>> On Wed, Oct 19, 2011 at 9:21 AM, <bejoy...@yahoo.com> wrote:
>>>>
>>>>> Hi Mark,
>>>>> What do your MapReduce job logs say? Try figuring out the error from
>>>>> there. From the Hive CLI you can hardly find the root cause of your
>>>>> errors. From the JobTracker web UI <http://hostname:50030/jobtracker.jsp>
>>>>> you can easily browse to the failed tasks and get the actual exception
>>>>> there. If you cannot figure it out from there, please post those logs
>>>>> along with your table schema.
>>>>>
>>>>> Regards
>>>>> Bejoy K S
>>>>> ------------------------------
>>>>> *From: * Mark Kerzner <mark.kerz...@shmsoft.com>
>>>>> *Date: *Wed, 19 Oct 2011 09:06:13 -0500
>>>>> *To: *Hive user<user@hive.apache.org>
>>>>> *ReplyTo: * user@hive.apache.org
>>>>> *Subject: *Hive query failing on group by
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to figure out what I am doing wrong with this query and the
>>>>> unusual error I am getting. Also suspicious is the reduce % going up and
>>>>> down.
>>>>>
>>>>> select trans.property_id, day(trans.log_timestamp) from trans JOIN opts
>>>>> on trans.ext_booking_id["ext_booking_id"] = opts.ext_booking_id group by
>>>>> day(trans.log_timestamp), trans.property_id;
>>>>>
>>>>> 2011-10-19 08:55:19,778 Stage-1 map = 0%,  reduce = 0%
>>>>> 2011-10-19 08:55:22,786 Stage-1 map = 100%,  reduce = 0%
>>>>> 2011-10-19 08:55:29,804 Stage-1 map = 100%,  reduce = 33%
>>>>> 2011-10-19 08:55:32,811 Stage-1 map = 100%,  reduce = 0%
>>>>> 2011-10-19 08:55:39,829 Stage-1 map = 100%,  reduce = 33%
>>>>> 2011-10-19 08:55:43,839 Stage-1 map = 100%,  reduce = 0%
>>>>> 2011-10-19 08:55:50,855 Stage-1 map = 100%,  reduce = 33%
>>>>> 2011-10-19 08:55:54,864 Stage-1 map = 100%,  reduce = 0%
>>>>> 2011-10-19 08:56:00,878 Stage-1 map = 100%,  reduce = 33%
>>>>> 2011-10-19 08:56:04,887 Stage-1 map = 100%,  reduce = 0%
>>>>> 2011-10-19 08:56:05,891 Stage-1 map = 100%,  reduce = 100%
>>>>> Ended Job = job_201110111849_0024 with errors
>>>>> FAILED: Execution Error, return code 2 from
>>>>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>>>>
>>>>> Thank you,
>>>>> Mark
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> With Regards
>>> Vikas Srivastava
>>>
>>> DWH & Analytics Team
>>> Mob:+91 9560885900
>>> One97 | Let's get talking !
>>>
>>>
>>
>
