The problem is a mixture of issues (HIVE-3411, HIVE-4209, HIVE-4212, HIVE-3464) and is still not completely fixed, even in trunk.
Will be fixed shortly.

2013/3/29 wzc <wzc1...@gmail.com>:
> The bug remains even if I apply the patch in HIVE-4206 :( The explain
> result hasn't changed.
>
> 2013/3/28 Navis류승우 <navis....@nexr.com>
>>
>> It's a bug (https://issues.apache.org/jira/browse/HIVE-4206).
>>
>> Thanks for reporting it.
>>
>> 2013/3/24 wzc <wzc1...@gmail.com>:
>> > Recently we tried to upgrade our hive from 0.9 to 0.10, but found some
>> > of our hive queries almost 7 times slower. One such query consists of
>> > multiple-table outer joins on the same key. By looking into the query,
>> > we found the query plans generated by hive 0.9 and hive 0.10 are
>> > different. Here is the example:
>> >
>> > testcase:
>> >
>> > use default;
>> > create table test_join (
>> > `key` string,
>> > `value` string
>> > );
>> >
>> > explain select
>> > sum(a.value) val
>> > from default.test_join a
>> > left outer join default.test_join b on a.key = b.key
>> > left outer join default.test_join c on a.key = c.key
>> > left outer join default.test_join d on a.key = d.key
>> > left outer join default.test_join e on a.key = e.key
>> > left outer join default.test_join f on a.key = f.key
>> > left outer join default.test_join g on a.key = g.key
>> >
>> >
>> > the explain of hive 0.9:
>> >
>> > STAGE DEPENDENCIES:
>> > Stage-1 is a root stage
>> > Stage-2 depends on stages: Stage-1
>> > Stage-0 is a root stage
>> > ...
>> > Reduce Operator Tree:
>> > Join Operator
>> > condition map:
>> > Left Outer Join0 to 1
>> > Left Outer Join0 to 2
>> > Left Outer Join0 to 3
>> > Left Outer Join0 to 4
>> > Left Outer Join0 to 5
>> > Left Outer Join0 to 6
>> > condition expressions:
>> > 0 {VALUE._col1}
>> > 1
>> > 2
>> > 3
>> > 4
>> > 5
>> > 6
>> > ......
>> >
>> >
>> > while the explain of hive 0.10:
>> >
>> > STAGE DEPENDENCIES:
>> > Stage-6 is a root stage
>> > Stage-1 depends on stages: Stage-6
>> > Stage-2 depends on stages: Stage-1
>> > Stage-0 is a root stage
>> > ...
>> > Reduce Operator Tree:
>> > Join Operator
>> > condition map:
>> > Left Outer Join0 to 1
>> > Left Outer Join0 to 2
>> > condition expressions:
>> > 0 {VALUE._col0} {VALUE._col1}
>> > 1
>> > 2
>> > ...
>> > Reduce Operator Tree:
>> > Join Operator
>> > condition map:
>> > Left Outer Join0 to 1
>> > Left Outer Join0 to 2
>> > Left Outer Join0 to 3
>> > Left Outer Join0 to 4
>> > condition expressions:
>> > 0 {VALUE._col9}
>> > 1
>> > 2
>> > 3
>> > 4
>> > ....
>> >
>> >
>> > It seems like hive 0.9 uses only one stage/job to process all outer
>> > joins, but hive 0.10 splits them into two stages. When running this
>> > kind of query on hive 0.10 in production, some reducers get stuck in
>> > the second stage of the outer join processing.
>> >
>> > I can't find any param to change the query plan. Can anyone give me
>> > some hint?
>> >
>> > Thanks!
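
For anyone who wants to check whether their own query is hit by the same split, here is a rough sketch in Python that counts the stages and the joins under each Join Operator in EXPLAIN text. It is not part of Hive; the function name summarize_plan and the two sample constants are made up for illustration, and the line patterns are assumed from the plan excerpts quoted in this thread only.

```python
import re


def summarize_plan(explain_text):
    """Return (number of stages listed, joins per Join Operator) for EXPLAIN text."""
    # Strip mail quote prefixes (">> > ") so plans pasted from a reply also work.
    lines = [re.sub(r'^[>\s]+', '', raw).strip() for raw in explain_text.splitlines()]

    # Stage lines look like "Stage-1 is a root stage" or
    # "Stage-2 depends on stages: Stage-1" (pattern assumed from this thread).
    stage_re = re.compile(r'Stage-\d+ (is a root stage|depends on stages:)')
    stage_count = sum(1 for line in lines if stage_re.match(line))

    # Each "Join Operator" has a condition map with one line per join,
    # e.g. "Left Outer Join0 to 1".
    join_re = re.compile(r'.*Join\s*\d+ to \d+$')
    joins_per_operator = []
    for line in lines:
        if line == 'Join Operator':
            joins_per_operator.append(0)
        elif joins_per_operator and join_re.match(line):
            joins_per_operator[-1] += 1
    return stage_count, joins_per_operator


# Excerpts copied from the plans quoted in this thread (elided parts left out).
PLAN_0_9 = """
Stage-1 is a root stage
Stage-2 depends on stages: Stage-1
Stage-0 is a root stage
Join Operator
condition map:
Left Outer Join0 to 1
Left Outer Join0 to 2
Left Outer Join0 to 3
Left Outer Join0 to 4
Left Outer Join0 to 5
Left Outer Join0 to 6
"""

PLAN_0_10 = """
Stage-6 is a root stage
Stage-1 depends on stages: Stage-6
Stage-2 depends on stages: Stage-1
Stage-0 is a root stage
Join Operator
condition map:
Left Outer Join0 to 1
Left Outer Join0 to 2
Join Operator
condition map:
Left Outer Join0 to 1
Left Outer Join0 to 2
Left Outer Join0 to 3
Left Outer Join0 to 4
"""

print(summarize_plan(PLAN_0_9))
print(summarize_plan(PLAN_0_10))
```

On the excerpts above this gives (3, [6]) for 0.9 and (4, [2, 4]) for 0.10: the same six joins, but spread over two Join Operators and one extra stage. The counts only cover what is pasted in, so a full EXPLAIN output is needed for a real comparison.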