[ https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902171#comment-17902171 ]
Sungwoo Park commented on HIVE-26986:
-------------------------------------
The performance gain in the 10TB TPC-DS test is small because the DAGs
produced by the affected queries are complex, so removing a few RSs does not
noticeably change the running time.
However, this patch clearly fixes a bug and also brings some performance
improvement. We don't have a sample query that demonstrates the improvement,
but I think some queries (similar to union10.q) on huge datasets will clearly
benefit from this patch.
> SWO and PEF make wrong decisions (e.g. by inserting unnecessary RSs) due to
> inconsistency between DAGs produced by OperatorGraph and Tez DAGs.
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-26986
> URL: https://issues.apache.org/jira/browse/HIVE-26986
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 4.0.0-alpha-2
> Reporter: Seonggon Namgung
> Assignee: Seonggon Namgung
> Priority: Major
> Labels: hive-4.1.0-must, pull-request-available
> Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as
> a parallel edge.
> We observed this problem by comparing the OperatorGraph DAG with the Tez DAG
> when running TPC-DS query 71 on 1TB managed tables in ORC format.
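
To illustrate the kind of mismatch described above, here is a minimal Python sketch. It is not Hive code: the function `parallel_edges`, the operator names, and both vertex groupings are hypothetical. It assumes only that a "parallel edge" means two or more operator-level edges connecting the same pair of vertices. The same operator edges are then classified differently depending on which grouping of operators into vertices is used.

```python
from collections import Counter


def parallel_edges(op_edges, vertex_of):
    """Return the (source vertex, target vertex) pairs that are connected
    by more than one operator-level edge under the given grouping."""
    counts = Counter(
        (vertex_of[src], vertex_of[dst])
        for src, dst in op_edges
        if vertex_of[src] != vertex_of[dst]  # ignore edges inside one vertex
    )
    return {pair for pair, n in counts.items() if n > 1}


# Hypothetical operator-level edges: two ReduceSinks feeding one join.
op_edges = [("RS_1", "JOIN_3"), ("RS_2", "JOIN_3")]

# Hypothetical grouping seen by the analysis: both RSs share a vertex.
analysis_grouping = {"RS_1": "V1", "RS_2": "V1", "JOIN_3": "V2"}

# Hypothetical grouping in the submitted DAG: the RSs are in separate vertices.
submitted_grouping = {"RS_1": "V1", "RS_2": "V3", "JOIN_3": "V2"}

print(parallel_edges(op_edges, analysis_grouping))   # {('V1', 'V2')}
print(parallel_edges(op_edges, submitted_grouping))  # set()
```

Under the first grouping the two edges look like a parallel edge that would need fixing; under the second they are ordinary edges. Any fix-up applied on the basis of the first grouping would therefore be unnecessary for the DAG that is actually submitted.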
--
This message was sent by Atlassian Jira
(v8.20.10#820010)