[ https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786224#comment-17786224 ]
Seonggon Namgung commented on HIVE-26986:
-----------------------------------------

@kkasa

1. This issue is not about data correctness. It addresses the insertion of unnecessary ReduceSink (RS) operators, which causes an unnecessary shuffle at runtime. The insertion is performed by ParallelEdgeFixer (PEF), which makes a wrong decision because OperatorGraph creates a wrong DAG from the given query plan. My previous comments explain how OperatorGraph incorrectly groups operators into a vertex (a "cluster" in OperatorGraph terms).

Since this issue originates from OperatorGraph, not from PEF or SharedWorkOptimizer (SWO), the submitted PR introduces TestOperatorGraph, which tests the behaviour of OperatorGraph. You can reproduce the problem by running this test on the master branch. The rest of this point explains the added test in more detail.

The test compares two DAGs, one generated by OperatorGraph and one generated by TezCompiler. The following graph represents the query plan used in the test.

  TS1┐
  TS2┴UNION─SEL─RS─GBY─RS

The correct DAG corresponding to the query plan should be:

  Map1: {TS1, SEL, RS1}
  Map2: {TS2, SEL, RS1}
  Reduce: {GBY, RS2}

But the current OperatorGraph groups the operators into 2 clusters as follows:

  Cluster1: {TS1, TS2, UNION, SEL, RS1}
  Cluster2: {GBY, RS2}

2. As I mentioned above, this issue is unrelated to data correctness. Moreover, PEF is applied to a query plan regardless of the value of `hive.optimize.shared.work.parallel.edge.support`. I think the test attached in the PR is sufficient to verify this issue.

FYI, `hive.optimize.shared.work.parallel.edge.support` controls which types of edges are allowed to form a parallel edge. If it is set to true, DynamicPartitionPruning (DPP), SemiJoinReduction, and Broadcast edges can form a parallel edge. If not, only DPP edges can form a parallel edge. As a consequence, SWO can create parallel edges regardless of the value of `hive.optimize.shared.work.parallel.edge.support`.
So Hive always runs PEF after SWO in order to resolve parallel edges by adding extra RS operators.

> A DAG created by OperatorGraph is not equal to the Tez DAG.
> -----------------------------------------------------------
>
>                 Key: HIVE-26986
>                 URL: https://issues.apache.org/jira/browse/HIVE-26986
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Major
>              Labels: hive-4.0.0-must, pull-request-available
>         Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as
> a parallel edge.
> We observe this problem by comparing the OperatorGraph DAG and the Tez DAG when
> running TPC-DS query 71 on a 1TB ORC format managed table.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
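
For readers who want to see the two groupings from point 1 of the comment side by side, here is a minimal, self-contained sketch in Java. It is not Hive code and does not use OperatorGraph, TezCompiler, or any Hive API. The class `VertexGroupingSketch`, its methods, and the string-based toy plan are all hypothetical names made up for illustration. It only models the plan `TS1/TS2 -> UNION -> SEL -> RS1 -> GBY -> RS2` and shows how merging every branch that shares an operator yields the two clusters from the comment, while walking each branch separately yields the three vertices described as correct.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative model only; all names here are hypothetical, not Hive classes.
public class VertexGroupingSketch {
    // Toy query plan from the comment: operator name -> child operator names.
    static final Map<String, List<String>> CHILDREN = new LinkedHashMap<>();
    static {
        CHILDREN.put("TS1", List.of("UNION"));
        CHILDREN.put("TS2", List.of("UNION"));
        CHILDREN.put("UNION", List.of("SEL"));
        CHILDREN.put("SEL", List.of("RS1"));
        CHILDREN.put("RS1", List.of("GBY"));
        CHILDREN.put("GBY", List.of("RS2"));
        CHILDREN.put("RS2", List.of());
    }

    static boolean isReduceSink(String op) {
        return op.startsWith("RS");
    }

    // Operators that start a vertex: those with no parent, or whose parent is an RS.
    static List<String> vertexRoots() {
        Set<String> roots = new LinkedHashSet<>(CHILDREN.keySet());
        for (Map.Entry<String, List<String>> e : CHILDREN.entrySet()) {
            if (!isReduceSink(e.getKey())) {
                roots.removeAll(e.getValue());
            }
        }
        return new ArrayList<>(roots);
    }

    // Walk downstream from op and stop at the first RS (the shuffle boundary).
    static void collect(String op, boolean skipUnion, Set<String> out) {
        if (!(skipUnion && op.equals("UNION"))) {
            out.add(op);
        }
        if (isReduceSink(op)) {
            return;
        }
        for (String child : CHILDREN.get(op)) {
            collect(child, skipUnion, out);
        }
    }

    // One vertex per root, with the UNION left out, as in the DAG the comment calls correct.
    static List<Set<String>> perBranchVertices() {
        List<Set<String>> vertices = new ArrayList<>();
        for (String root : vertexRoots()) {
            Set<String> vertex = new LinkedHashSet<>();
            collect(root, true, vertex);
            vertices.add(vertex);
        }
        return vertices;
    }

    // Branches that share any operator are merged into one cluster.
    // Simplification: merges into the first overlapping cluster only, which is enough for this plan.
    static List<Set<String>> mergedClusters() {
        List<Set<String>> clusters = new ArrayList<>();
        for (String root : vertexRoots()) {
            Set<String> reached = new LinkedHashSet<>();
            collect(root, false, reached);
            Set<String> target = null;
            for (Set<String> cluster : clusters) {
                if (!Collections.disjoint(cluster, reached)) {
                    target = cluster;
                    break;
                }
            }
            if (target == null) {
                clusters.add(reached);
            } else {
                target.addAll(reached);
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        System.out.println("Merged clusters:     " + mergedClusters());
        System.out.println("Per-branch vertices: " + perBranchVertices());
    }
}
```

In this toy model, `mergedClusters()` returns two groups (one holding TS1, TS2, UNION, SEL, and RS1, the other holding GBY and RS2), while `perBranchVertices()` returns three groups matching Map1, Map2, and Reduce from the comment. The model says nothing about how OperatorGraph or TezCompiler are actually implemented; it only makes the difference between the two groupings concrete.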