Re: Performance tuning in hive

2012-09-28 Thread Abhishek
Based on that you need to bucket and index them to get better performance. From a bird's-eye point of view, bucketing + indexing + map joins would be a good combination if those suit your data set. Regards, Bejoy KS From: Abhi

Re: Performance tuning in hive

2012-09-28 Thread Bejoy KS
dheld, please excuse typos. -----Original Message----- From: Abhishek Date: Fri, 28 Sep 2012 11:14:56 To: Bejoy Ks Reply-To: user@hive.apache.org Cc: user@hive.apache.org Subject: Re: Performance tuning in hive Hi Bejoy, How do I use CTAS with CLUSTERED BY? I am getting the following error when

Re: Performance tuning in hive

2012-09-28 Thread Abhishek
s your data set. Regards, Bejoy KS From: Abhishek To: "user@hive.apache.org" Cc: "user@hive.apache.org" Sent: Friday, September 28, 2012 5:16 AM Subject: Re: Performance tuning in hive Hi Bejoy, Th

Re: Performance tuning in hive

2012-09-28 Thread Abhishek
Cc: "user@hive.apache.org" Sent: Friday, September 28, 2012 5:16 AM Subject: Re: Performance tuning in hive Hi Bejoy, Thanks for the reply. Can I know whether a combination of 1) indexing and bucketing, or 2) bucketing with RC file

Re: Performance tuning in hive

2012-09-28 Thread Bejoy KS
if those suit your data set. Regards, Bejoy KS From: Abhishek To: "user@hive.apache.org" Cc: "user@hive.apache.org" Sent: Friday, September 28, 2012 5:16 AM Subject: Re: Performance tuning in hive Hi Bejoy, Thanks for the repl

Re: Performance tuning in hive

2012-09-27 Thread Abhishek
Hi Bejoy, Thanks for the reply. Can I know whether a combination of 1) indexing and bucketing, or 2) bucketing with RC file, or 3) sequence file with bucketing and indexing, or 4) map join with indexes, or any other combination of the above or of options not mentioned, would fetch a bette

Re: Performance tuning in hive

2012-09-27 Thread Bejoy KS
Hi Abhishek, You can have a look at join optimizations as well as group by optimizations. Join optimization - Based on your data sets you can go with a map-side join or a bucketed map join. To enable map join -> set hive.auto.convert.join = true; to enable bucketed map join ->  set hive.optimize.
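
For reference, the join settings discussed in this message can be sketched as a short session fragment. The second setting is cut off in the snippet above, so the property shown for it is an assumption: hive.optimize.bucketmapjoin is a real Hive property for bucketed map joins, but this listing does not confirm it is the one the original message named.

```sql
-- Let Hive convert a common join into a map-side join automatically
-- when one side of the join is small enough (named in the message above).
set hive.auto.convert.join = true;

-- Assumption: the truncated "set hive.optimize." line is not recoverable here.
-- hive.optimize.bucketmapjoin is one real Hive property that enables
-- bucketed map joins; it only helps when both tables are bucketed on the
-- join key and their bucket counts are multiples of each other.
set hive.optimize.bucketmapjoin = true;
```

These are session-level settings, so they would be issued before the join query they are meant to affect; whether either one helps depends on the data set, as noted earlier in the thread.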