How many map-reduce jobs are needed in a Hive query?

Richard Tue, 22 Jan 2013 19:45:47 -0800

I am wondering how to determine the number of map-reduce jobs that a Hive query needs.


For example, consider the following query:


select
sum(c1),
sum(c2),
k1
from
(
select transform(*) using 'mymapper' as c1, c2, k1
from t1
) a group by k1;


When I run this query, it takes two map-reduce jobs, but I expected it to take only one:
the map stage would run 'mymapper' as the mapper, the mapper output would be shuffled
by k1, and the reducer would compute the sums.
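
One way to see how many jobs Hive plans for a query is to prefix it with EXPLAIN. This is only a sketch of the check, assuming the same table t1 and the same 'mymapper' script as in the query above; its output is not shown here.

```sql
-- EXPLAIN prints the plan without running the query.
-- The STAGE DEPENDENCIES section lists the stages, and each stage
-- marked "Map Reduce" under STAGE PLANS is one map-reduce job.
explain
select
  sum(c1),
  sum(c2),
  k1
from
(
  select transform(*) using 'mymapper' as c1, c2, k1
  from t1
) a
group by k1;
```

Counting the "Map Reduce" stages in that output gives the number of jobs the query will launch.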


So why does Hive take two map-reduce jobs?
