Re: Skew Join Optimization in hive

2011-06-07 Thread Shantian Purkad
We have given hints to use map-side joins on small tables. We are planning to break this query into multiple steps, but would prefer options that help us keep the queries as is (with a few modifications and tuning instead of breaking the queries into multiple steps as there is quite a bit of complicate l

Re: Skew Join Optimization in hive

2011-06-07 Thread Igor Tatarinov
Have you tried splitting the query into 2 or 3 steps and/or enabling map joins (SET hive.auto.convert.join = true;) if some of the tables are smallish? On Tue, Jun 7, 2011 at 12:31 PM, Shantian Purkad wrote: > Hi, > > I have a query which joins 12 different tables (most of them left outer > joins
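
For reference, the setting named in this reply could be placed at the top of the query script as in the sketch below. The two skew join settings are an assumption: they are real Hive configuration properties but are not mentioned anywhere in this thread, and whether they help a query made mostly of left outer joins should be checked against the Hive version in use.

```sql
-- Automatic map join conversion, as suggested in the reply above.
SET hive.auto.convert.join = true;

-- Assumption (not from the thread): Hive's runtime skew join handling.
SET hive.optimize.skewjoin = true;
-- Join keys with more rows than this threshold are treated as skewed.
-- 100000 is the documented default; tune it to the data.
SET hive.skewjoin.key = 100000;
```

These are session-level settings, so the query text itself stays unchanged.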

Skew Join Optimization in hive

2011-06-07 Thread Shantian Purkad
Hi, I have a query which joins 12 different tables (most of them left outer joins) and the query takes almost 3 hours. 90% of the time is taken by a single reducer. One reducer is getting the bulk of the data to process. How can I get around this and have a fair distribution of data across all reduc