[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902171#comment-17902171
 ] 

Sungwoo Park commented on HIVE-26986:
-------------------------------------

The performance gain in the 10TB TPC-DS test is small, but that is because the 
affected queries produce complex DAGs, where removing a few RSs does not 
noticeably change the running time.
However, this patch clearly fixes a bug and also brings some performance 
improvement. We don't have a sample query that demonstrates the improvement, 
but I think some queries (similar to union10.q) on huge datasets will clearly 
benefit from this patch.

> SWO and PEF make wrong decisions (e.g. by inserting unnecessary RSs) due to 
> inconsistency between DAGs produced by OperatorGraph and Tez DAGs.
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-26986
>                 URL: https://issues.apache.org/jira/browse/HIVE-26986
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Major
>              Labels: hive-4.1.0-must, pull-request-available
>         Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is 
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer misreports a pair of normal edges 
> as a parallel edge.
> We observed this problem by comparing the OperatorGraph and the Tez DAG when 
> running TPC-DS query 71 on a 1TB managed table in ORC format.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
