Based on that you need to bucket and
>> index them to get better performance. From a bird's-eye view,
>> bucketing + indexing + map joins would be a good combination if those suit
>> your data set.
>>
>> Regards,
>> Bejoy KS
>>
>> From: Abhi
-Original Message-
From: Abhishek
Date: Fri, 28 Sep 2012 11:14:56
To: Bejoy Ks
Reply-To: user@hive.apache.org
Cc: user@hive.apache.org
Subject: Re: Performance tuning in hive
Hi Bejoy,
How do I use CTAS with CLUSTERED BY?
I am getting the following error when
> your data set.
>
> Regards,
> Bejoy KS
>
> From: Abhishek
> To: "user@hive.apache.org"
> Cc: "user@hive.apache.org"
> Sent: Friday, September 28, 2012 5:16 AM
> Subject: Re: Performance tuning in hive
>
> Hi Bejoy,
>
> Thanks for the reply. Can I know whether a combination of
> 1) Indexing and bucketing
> Or
> 2) Bucketing with RC file
Hi Bejoy,
Thanks for the reply. Can I know whether a combination of
1) Indexing and bucketing
Or
2) Bucketing with RC file
Or
3) Sequence file with bucketing and indexing
Or
4) Map join with indexes
Or
any other combination of the above, mentioned or not, would fetch a bette
Hi Abhishek,
You can have a look at join optimizations as well as group-by optimizations.
Join optimization - Based on your data sets you can go with a map-side join or
a bucketed map join or
to enable map join -> set hive.auto.convert.join = true;
to enable bucketed map join -> set hive.optimize.
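
For anyone following the thread, here is a rough sketch of how the pieces discussed above (bucketing, RC file storage, map joins) might fit together. It is not taken from any of the messages. The table and column names (orders, orders_bucketed, customers_bucketed, customer_id and so on) and the bucket count are hypothetical. Hive releases of that period generally rejected CTAS with a bucketed target table, so the sketch assumes the usual workaround of creating the table first and loading it with INSERT ... SELECT. The property hive.optimize.bucketmapjoin is a real Hive setting, but whether it is the one the cut-off line above was naming is an assumption.

```sql
-- Hypothetical schema, for illustration only. Assumes a source table
-- `orders` and a second table `customers_bucketed` (bucketed on
-- customer_id, with a `name` column) already exist.

-- Create the bucketed table explicitly instead of using CTAS.
CREATE TABLE orders_bucketed (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DOUBLE
)
CLUSTERED BY (customer_id) INTO 32 BUCKETS
STORED AS RCFILE;

-- Needed on older Hive releases so the insert really writes one file
-- per bucket (later releases always enforce bucketing).
SET hive.enforce.bucketing = true;

INSERT OVERWRITE TABLE orders_bucketed
SELECT order_id, customer_id, amount
FROM orders;

-- Let Hive convert a join to a map-side join when one side is small.
SET hive.auto.convert.join = true;

-- Bucketed map join: both tables must be bucketed on the join key,
-- with bucket counts that are multiples of each other.
SET hive.optimize.bucketmapjoin = true;

SELECT /*+ MAPJOIN(c) */ o.order_id, c.name, o.amount
FROM orders_bucketed o
JOIN customers_bucketed c
  ON o.customer_id = c.customer_id;
```

Whether this beats a plain map join or an index depends on table sizes and the join keys, so it is worth comparing run times on your own data before settling on one layout.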