Hi Pengcheng,

Is there a reason why the correlation optimization is disabled on Tez?

Even when I change the code to enable the correlation optimization on
Tez, I still get the same query plan:

>>> Vertex dependency in root stage
>>> Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
>>> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
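
For reference, the condition in the Optimizer.java snippet quoted below can be modeled on its own. This is only a minimal sketch: the class name, method name, and parameter names are hypothetical and it does not call into Hive. Only the boolean logic is taken from the quoted snippet. It shows that with the Tez flag set, CorrelationOptimizer is never added, whatever the other three settings are.

```java
// Hypothetical stand-alone model of the condition quoted from Optimizer.java.
// It does not use any Hive classes; the four parameters mirror the four flags
// tested in the quoted snippet.
final class CorrelationGateSketch {

    // Returns true when the quoted condition would add CorrelationOptimizer
    // to the list of transformations.
    static boolean addsCorrelationOptimizer(boolean optCorrelation,
                                            boolean groupBySkew,
                                            boolean skewJoinCompileTime,
                                            boolean isTezExecEngine) {
        return optCorrelation
                && !groupBySkew
                && !skewJoinCompileTime
                && !isTezExecEngine;
    }

    public static void main(String[] args) {
        // Same settings, differing only in the execution engine flag.
        System.out.println("non-Tez: " + addsCorrelationOptimizer(true, false, false, false));
        System.out.println("Tez:     " + addsCorrelationOptimizer(true, false, false, true));
    }
}
```

Removing the last clause from the real condition only lets the transformation be registered; whether the resulting Tez plan actually changes is a separate question that this sketch does not answer.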

On Tue, Sep 1, 2015 at 1:14 AM, Pengcheng Xiong <pxi...@apache.org> wrote:

> Hi Jeff,
>
> From the code base point of view, YSmart is integrated into Hive on Tez
> because it is one of the optimizations in the current Hive. However, from
> the execution point of view, it is currently disabled when Hive is running
> on Tez. You may take a look at the Hive source code:
>
> Optimizer.java, L175-180:
> {code}
> if (HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVEOPTCORRELATION) &&
>     !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVEGROUPBYSKEW) &&
>     !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVE_OPTIMIZE_SKEWJOIN_COMPILETIME) &&
>     !isTezExecEngine) {
>   transformations.add(new CorrelationOptimizer());
> }
> {code}
>
> Hope it helps.
>
> Best
> Pengcheng Xiong
>
>
> On Mon, Aug 31, 2015 at 12:56 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> The reason I ask is that when I execute the following SQL, Hive
>> generates a query plan with 4 vertices. As I understand it, if YSmart
>> is integrated into Hive, the plan should need only 3 vertices, since the
>> join key and the group-by key are the same. Does anybody know why?
>> Thanks
>>
>>
>> insert overwrite directory '/tmp/jzhang/1'
>> select o.o_orderkey as orderkey, count(1)
>> from lineitem l join orders o on l.l_orderkey = o.o_orderkey
>> group by o.o_orderkey;
>>
>> *YSmart Hive Jira*
>>
>> https://issues.apache.org/jira/browse/HIVE-2206
>>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>


-- 
Best Regards

Jeff Zhang
