Hi Pengcheng,

Is there a reason why the correlation optimization is disabled on Tez?
Even when I change the code to enable the correlation optimization on Tez, I still get the same query plan:

>>> Vertex dependency in root stage
>>> Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
>>> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)

On Tue, Sep 1, 2015 at 1:14 AM, Pengcheng Xiong <pxi...@apache.org> wrote:

> Hi Jeff,
>
> From the code base point of view, YSmart is integrated into Hive on Tez
> because it is one of the optimizations of the current Hive. However, from
> the execution point of view, it is now disabled when Hive is running on
> Tez. You may take a look at the source code of Hive:
>
> Optimizer.java, L175-180:
> {code}
> if (HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVEOPTCORRELATION) &&
>     !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVEGROUPBYSKEW) &&
>     !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVE_OPTIMIZE_SKEWJOIN_COMPILETIME) &&
>     !isTezExecEngine) {
>   transformations.add(new CorrelationOptimizer());
> }
> {code}
>
> Hope it helps.
>
> Best
> Pengcheng Xiong
>
> On Mon, Aug 31, 2015 at 12:56 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> The reason I ask this question is that when I execute the following
>> SQL, it generates a query plan with 4 vertices. But as I understand it,
>> if YSmart is integrated into Hive, it should only take 3 vertices,
>> since the join key and the group by key are the same. Does anybody know
>> about this? Thanks
>>
>> insert overwrite directory '/tmp/jzhang/1' select o.o_orderkey as
>> orderkey,count(1) from lineitem l
>> join orders o on
>> l.l_orderkey=o.o_orderkey group by o.o_orderkey;
>>
>> *YSmart Hive Jira*
>>
>> https://issues.apache.org/jira/browse/HIVE-2206
>>
>> --
>> Best Regards
>>
>> Jeff Zhang

--
Best Regards

Jeff Zhang
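
P.S. For reference, the guard quoted from Optimizer.java can be modelled as a plain boolean predicate. The sketch below is only a standalone illustration, not Hive code: the class name CorrelationGuard and the method shouldAddCorrelationOptimizer are hypothetical, and the four booleans stand in for the HiveConf lookups and the isTezExecEngine flag in the quoted snippet. It shows only when the optimizer is added to the transformation list; it does not explain why the plan stays the same once the Tez check is removed.

```java
// Hypothetical standalone model of the guard quoted from Optimizer.java.
// It does not depend on Hive; the parameters mirror the quoted condition.
class CorrelationGuard {

    // Returns true only when the quoted condition would add CorrelationOptimizer:
    // the correlation flag is on, both skew-related flags are off,
    // and the execution engine is not Tez.
    static boolean shouldAddCorrelationOptimizer(boolean optCorrelation,
                                                 boolean groupBySkew,
                                                 boolean skewJoinCompileTime,
                                                 boolean isTezExecEngine) {
        return optCorrelation
                && !groupBySkew
                && !skewJoinCompileTime
                && !isTezExecEngine;
    }

    public static void main(String[] args) {
        // Correlation flag on, no skew handling, non-Tez engine: optimizer is added.
        System.out.println(shouldAddCorrelationOptimizer(true, false, false, false));
        // Same settings on Tez: the last clause blocks the optimizer.
        System.out.println(shouldAddCorrelationOptimizer(true, false, false, true));
    }
}
```

With the unmodified guard, then, no combination of the three configuration flags adds the optimizer on Tez, because the final clause is false whenever isTezExecEngine is true.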