Re: why 1 reducer on simple join?

Wojciech Langiewicz Thu, 12 Jan 2012 15:28:53 -0800

I meant this query (without the create table ...):
select x.* from table1 x join table2 y where (
x.col1 = y.col1 and
x.col2 = y.col2 and
x.col3 = y.col3 and
x.col4 = y.col4 and
x.col5 = y.col5
);
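
A variant that may be worth trying is the same join with the equality conditions moved from WHERE into an ON clause. This is only a sketch, under the assumption that your Hive version plans a JOIN without ON as a cross product and does not turn the WHERE predicates into join keys; I have not verified that against your setup. Prefixing the query with explain prints the plan without launching the job, so the join keys can be checked first:

```sql
-- Same tables and columns as the query above; only the placement of the
-- join conditions changes (ON instead of WHERE).
explain
select x.* from table1 x join table2 y on (
x.col1 = y.col1 and
x.col2 = y.col2 and
x.col3 = y.col3 and
x.col4 = y.col4 and
x.col5 = y.col5
);
```

If the plan looks reasonable, dropping the explain keyword runs the query itself.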

This document might be useful: https://cwiki.apache.org/Hive/joinoptimization.html


Especially try this setting:
set hive.auto.convert.join = true; (or false)
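
As a rough sketch of how to compare the two values in one session: typing set with only the property name prints its current value, so you can confirm what Hive is actually using before each run. Whether the join is really converted to a map-side join depends on one of the tables being small enough, which is an assumption here, not something I know about your data.

```sql
-- Print the current value of the setting.
set hive.auto.convert.join;

-- First run: allow Hive to convert the join to a map-side join.
set hive.auto.convert.join = true;
-- ... run the select here and note the number of mappers and reducers ...

-- Second run: force a regular (reduce-side) join for comparison.
set hive.auto.convert.join = false;
-- ... run the same select again and compare ...
```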

Which version of Hive are you using?

On 13.01.2012 00:24, Koert Kuipers wrote:

hive>  set mapred.reduce.tasks = 3;
hive>  select count(*) from table1 group by column1 limit 10;
query runs with 38 mappers and 3 reducers

hive>  select count(*) from table2 group by column1 limit 10;
query runs with 6 mappers and 3 reducers

On Thu, Jan 12, 2012 at 6:09 PM, Wojciech Langiewicz
<wlangiew...@gmail.com> wrote:

What do you mean by "Selects run fine"? Are they using the number of reducers
that you set?
It might help if you could show the actual query.


On 13.01.2012 00:03, Koert Kuipers wrote:

I tried set mapred.reduce.tasks = xyz; and Hive ignored it.
Selects run fine. The query uses 44 mappers.

On Thu, Jan 12, 2012 at 6:00 PM, Wojciech Langiewicz
<wlangiew...@gmail.com> wrote:

Hello,

Have you tried running only the select, without creating the table? What are
the results?
How did you try to set the number of reducers? Have you used this:
set mapred.reduce.tasks = xyz;
How many mappers does this query use?


On 12.01.2012 23:53, Koert Kuipers wrote:

I am running a basic join of 2 tables and it will only run with 1 reducer.
Why is that? I tried to set the number of reducers and it didn't work.
Hive just ignored it.

create table z as select x.* from table1 x join table2 y where (
x.col1 = y.col1 and
x.col2 = y.col2 and
x.col3 = y.col3 and
x.col4 = y.col4 and
x.col5 = y.col5
);

Both tables are backed by multiple files / blocks / chunks.


  --

Wojciech Langiewicz
