> On April 27, 2020, 6:02 p.m., Vineet Garg wrote:
> > ql/src/test/results/clientpositive/llap/keep_uniform.q.out
> > Lines 946 (patched)
> > <https://reviews.apache.org/r/72431/diff/2/?file=2227827#file2227827line956>
> >
> >     Why is there an extra join in the plan now?
>
> Krisztian Kasa wrote:
>     I ran `explain cbo` with this query on master and on the branch where
>     this patch is applied. Both plans have 8 HiveJoin operators. Comparing
>     the CBO plans, I see that 2 of the joins were reordered:
>     On master: (web_returns + (web_sales + web_sales))
>     On the branch: (web_sales + web_returns) + web_sales
>
>     It also turned out that on master the CBO plan contains a HiveProject
>     with all the columns from table `web_returns`. The reason is the same as
>     for the example query mentioned in HIVE-23206. In the plan created after
>     applying this patch, this project has only the necessary column
>     (wr_order_number).
>
>     In the physical plan there are 7 joins on master and 8 when this patch
>     is applied. SharedWorkOptimizer merges two of them; I need to
>     investigate further...

On master, two joins are merged because their parent ReduceSinks are merged in `sharedWorkExtendedOptimization`.
See the plan after `SharedWorkOptimizer` and before `SharedWorkExtendedOptimizer`:

Plan on master
```
TS[0]-FIL[102]-SEL[2]-RS[47]-MERGEJOIN[231]-RS[50]-MERGEJOIN[232]-RS[53]-MERGEJOIN[236]-RS[56]-MERGEJOIN[237]-RS[59]-MERGEJOIN[238]-GBY[111]-RS[112]-GBY[113]-GBY[114]-RS[115]-GBY[116]-FS[67]
TS[3]-FIL[103]-SEL[5]-RS[48]-MERGEJOIN[231]
TS[6]-FIL[104]-SEL[8]-RS[51]-MERGEJOIN[232]
TS[9]-FIL[105]-SEL[11]-RS[15]-MERGEJOIN[233]-SEL[18]-GBY[19]-RS[20]-GBY[21]-RS[54]-MERGEJOIN[236]
                      -RS[32]-MERGEJOIN[234]-SEL[35]-RS[37]-MERGEJOIN[235]-GBY[40]-RS[41]-GBY[42]-RS[57]-MERGEJOIN[237]
TS[12]-FIL[106]-SEL[14]-RS[16]-MERGEJOIN[233]
                       -RS[33]-MERGEJOIN[234]
TS[23]-FIL[107]-SEL[25]-RS[36]-MERGEJOIN[235]
TS[44]-FIL[110]-SEL[46]-RS[60]-MERGEJOIN[238]
```
RS[16] and RS[33] were merged.

Plan after applying the patch
```
TS[0]-FIL[101]-SEL[2]-RS[46]-MERGEJOIN[213]-RS[49]-MERGEJOIN[214]-RS[52]-MERGEJOIN[218]-RS[55]-MERGEJOIN[219]-RS[58]-MERGEJOIN[220]-GBY[110]-RS[111]-GBY[112]-GBY[113]-RS[114]-GBY[115]-FS[66]
TS[3]-FIL[102]-SEL[5]-RS[47]-MERGEJOIN[213]
TS[6]-FIL[103]-SEL[8]-RS[50]-MERGEJOIN[214]
TS[9]-FIL[104]-SEL[11]-RS[15]-MERGEJOIN[215]-SEL[18]-GBY[19]-RS[20]-GBY[21]-RS[53]-MERGEJOIN[218]
                      -RS[32]-MERGEJOIN[216]-RS[35]-MERGEJOIN[217]-SEL[38]-GBY[39]-RS[40]-GBY[41]-RS[56]-MERGEJOIN[219]
                      -RS[36]-MERGEJOIN[217]
TS[12]-FIL[105]-SEL[14]-RS[16]-MERGEJOIN[215]
TS[26]-FIL[107]-SEL[28]-RS[33]-MERGEJOIN[216]
TS[43]-FIL[109]-SEL[45]-RS[59]-MERGEJOIN[220]
```
RS[15] and RS[32] were not merged because MERGEJOIN[215] and MERGEJOIN[216] have different keys.
RS[15] and RS[36] were not merged because their `tag` is different.


- Krisztian


-----------------------------------------------------------
This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/72431/#review220503
-----------------------------------------------------------


On May 4, 2020, 5:02 a.m., Krisztian Kasa wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72431/
> -----------------------------------------------------------
>
> (Updated May 4, 2020, 5:02 a.m.)
>
>
> Review request for hive, Jesús Camacho Rodríguez, Steve Carlin, and Vineet Garg.
>
>
> Bugs: HIVE-23206
>     https://issues.apache.org/jira/browse/HIVE-23206
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Project not defined correctly after reordering a join
>
>
> Diffs
> -----
>
>   itests/src/test/resources/testconfiguration.properties 5468728f83
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinProjectTransposeRule.java 492c55e050
>   ql/src/test/queries/clientpositive/join_reorder5.q PRE-CREATION
>   ql/src/test/results/clientpositive/join22.q.out ad34bc4310
>   ql/src/test/results/clientpositive/llap/correlationoptimizer3.q.out f063766a1f
>   ql/src/test/results/clientpositive/llap/join_reorder5.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/llap/keep_uniform.q.out 54d0b5fab6
>   ql/src/test/results/clientpositive/llap/sharedwork.q.out f8d3b4b2f5
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out 311cee743d
>   ql/src/test/results/clientpositive/perf/tez/cbo_query2.q.out 26a98ffcec
>   ql/src/test/results/clientpositive/perf/tez/cbo_query59.q.out abc5d999b5
>   ql/src/test/results/clientpositive/perf/tez/cbo_query95.q.out 218ca7d8b6
>   ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query14.q.out eaa1defa81
>   ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query2.q.out 4c90da4476
>   ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query59.q.out 8d17cc79d1
>   ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query95.q.out ace074316b
>   ql/src/test/results/clientpositive/perf/tez/constraints/query14.q.out 8204245245
>   ql/src/test/results/clientpositive/perf/tez/constraints/query2.q.out 66777769e6
>   ql/src/test/results/clientpositive/perf/tez/constraints/query59.q.out f7c7260077
>   ql/src/test/results/clientpositive/perf/tez/constraints/query95.q.out 39d35ec330
>   ql/src/test/results/clientpositive/perf/tez/query2.q.out 0e67e97c02
>   ql/src/test/results/clientpositive/perf/tez/query59.q.out 1a2ba964f4
>   ql/src/test/results/clientpositive/perf/tez/query95.q.out f15afbed4b
>   ql/src/test/results/clientpositive/runtime_skewjoin_mapjoin_spark.q.out 9547e4fa7c
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out 8fb82e1659
>
>
> Diff: https://reviews.apache.org/r/72431/diff/4/
>
>
> Testing
> -------
>
> mvn test -Dtest.output.overwrite -DskipSparkTests -Dtest=TestMiniLlapLocalCliDriver -Dqfile=join_reorder5.q -pl itests/qtest -Pitests
>
>
> Thanks,
>
> Krisztian Kasa
>
>
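
For readers following the reply at the top of this message, here is a minimal sketch of the two reasons given there for ReduceSinks not being merged (different keys, different `tag`). It is not Hive's actual `SharedWorkOptimizer` code. The class `ReduceSinkMergeSketch`, the `Rs` type, and `mergeBlocker` are hypothetical names made up for illustration. The key names (`keyA`, `keyB`) and tag values are placeholders, since the real keys and tags are not shown in this thread, and the "same parent" check is an assumption read off the plan layout.

```java
import java.util.Arrays;
import java.util.List;

public class ReduceSinkMergeSketch {

    /** Hypothetical, simplified stand-in for a ReduceSink; not Hive's ReduceSinkOperator. */
    public static final class Rs {
        final String name;       // label from the plan, e.g. "RS[15]"
        final String parent;     // operator feeding this ReduceSink, e.g. "SEL[11]"
        final List<String> keys; // key columns (placeholder names, not from the real plan)
        final int tag;           // which join input this ReduceSink feeds (assumed values)

        public Rs(String name, String parent, List<String> keys, int tag) {
            this.name = name;
            this.parent = parent;
            this.keys = keys;
            this.tag = tag;
        }
    }

    /** Returns the first reason two ReduceSinks cannot be merged, or "none" if nothing blocks it. */
    public static String mergeBlocker(Rs a, Rs b) {
        if (!a.parent.equals(b.parent)) {
            return "different parent"; // assumption: only siblings are merge candidates
        }
        if (!a.keys.equals(b.keys)) {
            return "different keys";
        }
        if (a.tag != b.tag) {
            return "different tag";
        }
        return "none";
    }

    public static void main(String[] args) {
        // Placeholder keys and tags chosen only to reproduce the two cases from the reply.
        Rs rs15 = new Rs("RS[15]", "SEL[11]", Arrays.asList("keyA"), 0);
        Rs rs32 = new Rs("RS[32]", "SEL[11]", Arrays.asList("keyB"), 0);
        Rs rs36 = new Rs("RS[36]", "SEL[11]", Arrays.asList("keyA"), 1);

        System.out.println(rs15.name + " vs " + rs32.name + ": " + mergeBlocker(rs15, rs32));
        System.out.println(rs15.name + " vs " + rs36.name + ": " + mergeBlocker(rs15, rs36));
    }
}
```

The order of the checks is arbitrary in this sketch; the real optimizer may apply further conditions that are not discussed in this thread.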