[ https://issues.apache.org/jira/browse/HIVE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576918#comment-16576918 ]
Aihua Xu commented on HIVE-20331:
---------------------------------

GenMRRedSink3 won't get triggered in this case (it is triggered when a union is followed by an RS operator). Here are the plans with and without the patch. As you can see, without the patch the union operator is incorrectly added to Stage-4.

Before:
{noformat}
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-6 depends on stages: Stage-4
  Stage-2 depends on stages: Stage-6
  Stage-0 depends on stages: Stage-2

STAGE PLANS:
  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: col1 (type: int)
              sort order: +
              Map-reduce partition columns: col1 (type: int)
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
          TableScan
            Union
              Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
              File Output Operator
                compressed: false
                Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: int)
          outputColumnNames: _col0
          Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
          PTF Operator
            Function definitions:
                Input definition
                  input alias: ptf_0
                  output shape: _col0: int
                  type: WINDOWING
                Windowing table definition
                  input alias: ptf_1
                  name: windowingtablefunction
                  order by: _col0 ASC NULLS FIRST
                  partition by: _col0
                  raw input shape:
                  window functions:
                      window function definition
                        alias: Row_Number_window_0
                        name: Row_Number
                        window function: GenericUDAFRowNumberEvaluator
                        window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
                        isPivotResult: true
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
              Lateral View Forward
                Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                  Lateral View Join Operator
                    outputColumnNames: _col1, _col2
                    Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                      File Output Operator
                        compressed: false
                        table:
                            input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                            serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
                Select Operator
                  expressions: map(10:1) (type: map<int,int>)
                  outputColumnNames: _col0
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                  UDTF Operator
                    Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                    function name: explode
                    Lateral View Join Operator
                      outputColumnNames: _col1, _col2
                      Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                      Select Operator
                        Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                        File Output Operator
                          compressed: false
                          table:
                              input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                              output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                              serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

  Stage: Stage-6
    Map Reduce Local Work
      Alias -> Map Local Tables:
        _u1-subquery2:x1:t1
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        _u1-subquery2:x1:t1
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
              HashTable Sink Operator
                keys:
                  0
                  1

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: 1 (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
              Union
                Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          TableScan
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              keys:
                0
                1
              Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: 2 (type: int)
                outputColumnNames: _col0
                Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE Column stats: NONE
                Union
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{noformat}

After:
{noformat}
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-6 depends on stages: Stage-4
  Stage-2 depends on stages: Stage-6
  Stage-0 depends on stages: Stage-2

STAGE PLANS:
  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: col1 (type: int)
              sort order: +
              Map-reduce partition columns: col1 (type: int)
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
      Execution mode: vectorized
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: int)
          outputColumnNames: _col0
          Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
          PTF Operator
            Function definitions:
                Input definition
                  input alias: ptf_0
                  output shape: _col0: int
                  type: WINDOWING
                Windowing table definition
                  input alias: ptf_1
                  name: windowingtablefunction
                  order by: _col0 ASC NULLS FIRST
                  partition by: _col0
                  raw input shape:
                  window functions:
                      window function definition
                        alias: Row_Number_window_0
                        name: Row_Number
                        window function: GenericUDAFRowNumberEvaluator
                        window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
                        isPivotResult: true
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
              Lateral View Forward
                Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                  Lateral View Join Operator
                    outputColumnNames: _col1, _col2
                    Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                      File Output Operator
                        compressed: false
                        table:
                            input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                            serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
                Select Operator
                  expressions: map(10:1) (type: map<int,int>)
                  outputColumnNames: _col0
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                  UDTF Operator
                    Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                    function name: explode
                    Lateral View Join Operator
                      outputColumnNames: _col1, _col2
                      Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                      Select Operator
                        Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE Column stats: NONE
                        File Output Operator
                          compressed: false
                          table:
                              input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                              output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                              serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

  Stage: Stage-6
    Map Reduce Local Work
      Alias -> Map Local Tables:
        _u1-subquery2:x1:t1
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        _u1-subquery2:x1:t1
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
              HashTable Sink Operator
                keys:
                  0
                  1

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: 1 (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
              Union
                Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          TableScan
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              keys:
                0
                1
              Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: 2 (type: int)
                outputColumnNames: _col0
                Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE Column stats: NONE
                Union
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{noformat}

> Query with union all, lateral view and Join fails with "cannot find parent in the child operator"
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20331
>                 URL: https://issues.apache.org/jira/browse/HIVE-20331
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 2.1.1
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>            Priority: Major
>         Attachments: HIVE-20331.1.patch
>
>
> The following query with Union, Lateral view and Join will fail during execution with the exception below.
> {noformat}
> create table t1(col1 int);
> SELECT 1 AS `col1`
> FROM t1
> UNION ALL
> SELECT 2 AS `col1`
> FROM
> (SELECT col1
> FROM t1
> ) x1
> JOIN
> (SELECT col1
> FROM
> (SELECT
> Row_Number() over (PARTITION BY col1 ORDER BY col1) AS `col1`
> FROM t1
> ) x2 lateral VIEW explode(map(10,1))`mapObj` AS `col2`, `col3`
> ) `expdObj`
> {noformat}
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive internal error: cannot find parent in the child operator!
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:509) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {noformat}
> After debugging, it seems the issue is in the GenMRFileSink1 class, where we set an incorrect aliasToWork on the MapWork.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
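
For readers unfamiliar with the "cannot find parent in the child operator" failure quoted in the issue description, here is a minimal sketch of the kind of consistency check that can raise it. This is not Hive's code: `OperatorGraphSketch`, `Op`, `connect` and `initialize` are hypothetical names, and the sketch only assumes that an operator, while initializing, looks itself up in each child's parent list. Under that assumption, a map-side entry point that lists an operator (for example a Union) as its child while that operator's parents live in a different stage would fail in this way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of an operator graph; not Hive's Operator API.
final class OperatorGraphSketch {

    // A plan node that tracks both its parents and its children.
    static final class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();

        Op(String name) {
            this.name = name;
        }
    }

    // Wires an edge in both directions, which keeps the graph consistent.
    static void connect(Op parent, Op child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    // Walks the graph from a root and checks that every child knows its parent.
    // A one-directional edge (child listed, parent missing) is reported as an error.
    static void initialize(Op op) {
        for (Op child : op.children) {
            if (child.parents.indexOf(op) == -1) {
                throw new IllegalStateException(
                    "cannot find parent in the child operator: "
                        + op.name + " -> " + child.name);
            }
            initialize(child);
        }
    }

    public static void main(String[] args) {
        // Consistent wiring: SELECT -> UNION -> FILE_SINK.
        Op select = new Op("SELECT");
        Op union = new Op("UNION");
        Op sink = new Op("FILE_SINK");
        connect(select, union);
        connect(union, sink);
        initialize(select); // passes

        // Mis-wired root: it lists UNION as a child, but UNION does not list it as a parent.
        Op strayScan = new Op("STRAY_TABLE_SCAN");
        strayScan.children.add(union);
        try {
            initialize(strayScan);
        } catch (IllegalStateException e) {
            System.out.println("Initialization failed: " + e.getMessage());
        }
    }
}
```

In the sketch, the stray root plays the role of the extra `TableScan` / `Union` pair that shows up under Stage-4 in the "Before" plan; whether Hive reaches the error through exactly this path is an assumption here.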