It's a bug (https://issues.apache.org/jira/browse/HIVE-4206).
Thanks for reporting it. 2013/3/24 wzc <wzc1...@gmail.com>: > Recently we tried to upgrade our hive from 0.9 to 0.10, but found some of > our hive queries almost 7 times slow. One of such query consists multiple > table outer join on the same key. By looking into the query, we found the > query plans generate by hive 0.9 and hive 0.10 are different. Here is the > example: > > testcase: > > use default; > create table test_join ( > `key` string, > `value` string > ); > > explain select > sum(a.value) val > from default.test_join a > left outer join default.test_join b on a.key = b.key > left outer join default.test_join c on a.key = c.key > left outer join default.test_join d on a.key = d.key > left outer join default.test_join e on a.key = e.key > left outer join default.test_join f on a.key = f.key > left outer join default.test_join g on a.key = g.key > > > the explain of hive 0.9: > > STAGE DEPENDENCIES: > > Stage-1 is a root stage > > Stage-2 depends on stages: Stage-1 > > Stage-0 is a root stage > > ... > > Reduce Operator Tree: > > Join Operator > > condition map: > > Left Outer Join0 to 1 > > Left Outer Join0 to 2 > > Left Outer Join0 to 3 > > Left Outer Join0 to 4 > > Left Outer Join0 to 5 > > Left Outer Join0 to 6 > > condition expressions: > > 0 {VALUE._col1} > > 1 > > 2 > > 3 > > 4 > > 5 > > 6 > > ...... > > > while the explain of hive 0.10: > > STAGE DEPENDENCIES: > > Stage-6 is a root stage > > Stage-1 depends on stages: Stage-6 > > Stage-2 depends on stages: Stage-1 > > Stage-0 is a root stage > > ... > > Reduce Operator Tree: > > Join Operator > > condition map: > > Left Outer Join0 to 1 > > Left Outer Join0 to 2 > > condition expressions: > > 0 {VALUE._col0} {VALUE._col1} > > 1 > > 2 > > ... > > Reduce Operator Tree: > > Join Operator > > condition map: > > Left Outer Join0 to 1 > > Left Outer Join0 to 2 > > Left Outer Join0 to 3 > > Left Outer Join0 to 4 > > condition expressions: > > 0 {VALUE._col9} > > 1 > > 2 > > 3 > > 4 > > .... > > > It seems like hive 0.9 use only one stage/job to process all outer joins but > hive 0.10 split them into two stage. When running such kind of query on > hive0.10 in production, in the second stage of outer join process, some > reducer stucks. > > I can't find any param to change the query plain , can anyone give me some > hint? > > Thanks! > > > >