Yes, Bejoy, it was the data. I also have to be strict with GROUP BY and not give it any fields with an aggregator function (unlike MySQL).
Thank you,
Mark

On Wed, Oct 19, 2011 at 11:54 AM, <bejoy...@yahoo.com> wrote:

> Looks like some data problem. Were you using the GROUP BY query on the same
> data set?
> But if count(*) also throws an error then it comes back to square one: an
> installation/configuration problem with Hive or MapReduce.
>
> Regards
> Bejoy K S
> ------------------------------
> *From: * Mark Kerzner <mark.kerz...@shmsoft.com>
> *Date: *Wed, 19 Oct 2011 10:55:34 -0500
> *To: *<user@hive.apache.org>; <bejoy...@yahoo.com>
> *ReplyTo: * user@hive.apache.org
> *Subject: *Re: Hive query failing on group by
>
> Bejoy,
>
> I've been using this install of Hive for some time now, and simple queries
> and joins work fine. It's the GROUP BY that I have problems with, sometimes
> even with COUNT(*).
>
> I am trying to isolate the problem now, and reduce it to the smallest query
> possible. I am also trying to find a workaround (I noticed that sometimes
> rephrasing queries for Hive helps), since I need this for a project.
>
> Thank you,
> Mark
>
> On Wed, Oct 19, 2011 at 10:25 AM, <bejoy...@yahoo.com> wrote:
>
>> Mark
>> To ensure your Hive installation is fine, run two queries:
>> SELECT * FROM trans LIMIT 10;
>> SELECT * FROM trans WHERE ***;
>> You can try this for a couple of different tables. If these queries return
>> results and work as desired, then your Hive is probably working fine.
>>
>> If that works, as the second step issue a simple join between two tables
>> on primitive data type columns. If that also looks good, then you can more
>> or less confirm that the bug is in your Hive query.
>>
>> We can look in that direction then.
>>
>> Regards
>> Bejoy K S
>> ------------------------------
>> *From: * Mark Kerzner <mark.kerz...@shmsoft.com>
>> *Date: *Wed, 19 Oct 2011 10:02:57 -0500
>> *To: *<user@hive.apache.org>
>> *ReplyTo: * user@hive.apache.org
>> *Subject: *Re: Hive query failing on group by
>>
>> Vikas,
>>
>> I am using Cloudera CDHU1 on Ubuntu. I get the same results on RedHat
>> CDHU0.
>>
>> Mark
>>
>> On Wed, Oct 19, 2011 at 9:47 AM, Vikas Srivastava <vikas.srivast...@one97.net> wrote:
>>
>>> Install Hive with the RPM; this install is corrupted!
>>>
>>> On Wed, Oct 19, 2011 at 8:01 PM, Mark Kerzner <mark.kerz...@shmsoft.com> wrote:
>>>
>>>> Here is what my Hive logs say:
>>>>
>>>> hive -hiveconf hive.root.logger=DEBUG
>>>>
>>>> 2011-10-19 09:24:35,148 ERROR DataNucleus.Plugin
>>>> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
>>>> "org.eclipse.core.resources" but it cannot be resolved.
>>>> 2011-10-19 09:24:35,150 ERROR DataNucleus.Plugin
>>>> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
>>>> "org.eclipse.core.runtime" but it cannot be resolved.
>>>> 2011-10-19 09:24:35,150 ERROR DataNucleus.Plugin
>>>> (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
>>>> "org.eclipse.text" but it cannot be resolved.
>>>>
>>>> On Wed, Oct 19, 2011 at 9:21 AM, <bejoy...@yahoo.com> wrote:
>>>>
>>>>> Hi Mark
>>>>> What do your MapReduce job logs say? Try figuring out the error from
>>>>> there. From the Hive CLI you can hardly find the root cause of your
>>>>> errors. From the job tracker web UI <http://hostname:50030/jobtracker.jsp>
>>>>> you can easily browse to the failed tasks and get the actual exception
>>>>> from there. If you are not able to figure it out from there, then please
>>>>> post those logs along with your table schema.
>>>>>
>>>>> Regards
>>>>> Bejoy K S
>>>>> ------------------------------
>>>>> *From: * Mark Kerzner <mark.kerz...@shmsoft.com>
>>>>> *Date: *Wed, 19 Oct 2011 09:06:13 -0500
>>>>> *To: *Hive user<user@hive.apache.org>
>>>>> *ReplyTo: * user@hive.apache.org
>>>>> *Subject: *Hive query failing on group by
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to figure out what I am doing wrong with this query and the
>>>>> unusual error I am getting. Also suspicious is the reduce % going up and
>>>>> down.
>>>>>
>>>>> select trans.property_id, day(trans.log_timestamp) from trans JOIN opts
>>>>> on trans.ext_booking_id["ext_booking_id"] = opts.ext_booking_id group by
>>>>> day(trans.log_timestamp), trans.property_id;
>>>>>
>>>>> 2011-10-19 08:55:19,778 Stage-1 map = 0%, reduce = 0%
>>>>> 2011-10-19 08:55:22,786 Stage-1 map = 100%, reduce = 0%
>>>>> 2011-10-19 08:55:29,804 Stage-1 map = 100%, reduce = 33%
>>>>> 2011-10-19 08:55:32,811 Stage-1 map = 100%, reduce = 0%
>>>>> 2011-10-19 08:55:39,829 Stage-1 map = 100%, reduce = 33%
>>>>> 2011-10-19 08:55:43,839 Stage-1 map = 100%, reduce = 0%
>>>>> 2011-10-19 08:55:50,855 Stage-1 map = 100%, reduce = 33%
>>>>> 2011-10-19 08:55:54,864 Stage-1 map = 100%, reduce = 0%
>>>>> 2011-10-19 08:56:00,878 Stage-1 map = 100%, reduce = 33%
>>>>> 2011-10-19 08:56:04,887 Stage-1 map = 100%, reduce = 0%
>>>>> 2011-10-19 08:56:05,891 Stage-1 map = 100%, reduce = 100%
>>>>> Ended Job = job_201110111849_0024 with errors
>>>>> FAILED: Execution Error, return code 2 from
>>>>> org.apache.hadoop.hive.ql.exec.MapRedTask
>>>>>
>>>>> Thank you,
>>>>> Mark
>>>
>>> --
>>> With Regards
>>> Vikas Srivastava
>>>
>>> DWH & Analytics Team
>>> Mob:+91 9560885900
>>> One97 | Let's get talking !
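
Note on the GROUP BY remark at the top of the thread: one reading of it is that Hive requires every non-aggregated SELECT expression to also appear as a GROUP BY key, whereas MySQL accepts such a query when its ONLY_FULL_GROUP_BY mode is off. The sketch below illustrates that rule using the `trans` and `opts` tables from the original query. The `COUNT(*) AS bookings` column and the `log_day` alias are illustrative additions, not part of Mark's query, and the second statement is left commented out because Hive is expected to reject it at compile time.

```sql
-- Accepted by Hive: each non-aggregated SELECT expression is also a GROUP BY key.
SELECT trans.property_id,
       day(trans.log_timestamp) AS log_day,
       COUNT(*) AS bookings   -- illustrative aggregate, not in the original query
FROM trans
JOIN opts
  ON trans.ext_booking_id["ext_booking_id"] = opts.ext_booking_id
GROUP BY day(trans.log_timestamp), trans.property_id;

-- Expected to be rejected by Hive: trans.property_id is neither a GROUP BY key
-- nor wrapped in an aggregate. MySQL runs this form when ONLY_FULL_GROUP_BY is off.
-- SELECT trans.property_id, day(trans.log_timestamp)
-- FROM trans
-- GROUP BY day(trans.log_timestamp);
```

Mark's original query already satisfies this rule, which fits the thread's conclusion that the failure came from the data and not from the query shape.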