[ 
https://issues.apache.org/jira/browse/HIVE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576918#comment-16576918
 ] 

Aihua Xu commented on HIVE-20331:
---------------------------------

GenMRRedSink3 is not triggered in this case (it is triggered when a union is 
followed by an RS (ReduceSink) operator).

Here are the plans with and without the patch. As you can see, without the 
patch the union operator is incorrectly added to Stage-4.
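
For anyone comparing the two plans by hand, one quick check is to list which stages contain a Union operator. The following is a minimal Python sketch, assuming the EXPLAIN output is available as plain text. The helper name `stages_with_operator` and the abbreviated plan excerpts are illustrative only and are not part of Hive.

```python
import re


def stages_with_operator(plan_text, operator="Union"):
    """Return the stages (in order of appearance) whose plan has a line equal to `operator`."""
    stages = []
    current = None
    for line in plan_text.splitlines():
        stripped = line.strip()
        match = re.match(r"Stage: (Stage-\d+)$", stripped)
        if match:
            # A "Stage: Stage-N" header starts the plan of a new stage.
            current = match.group(1)
        elif stripped == operator and current is not None and current not in stages:
            stages.append(current)
    return stages


# Abbreviated excerpts of the plans in this comment (operator names only).
before = """\
STAGE PLANS:
  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            Reduce Output Operator
          TableScan
            Union
              File Output Operator
  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            Select Operator
              Union
                File Output Operator
"""

after = """\
STAGE PLANS:
  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            Reduce Output Operator
  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            Select Operator
              Union
                File Output Operator
"""

print(stages_with_operator(before))  # ['Stage-4', 'Stage-2']
print(stages_with_operator(after))   # ['Stage-2']
```

With the excerpts above, the Union shows up in both Stage-4 and Stage-2 before the patch and only in Stage-2 after it.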

Before:
{noformat}
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-6 depends on stages: Stage-4
  Stage-2 depends on stages: Stage-6
  Stage-0 depends on stages: Stage-2

STAGE PLANS:
  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: col1 (type: int)
              sort order: +
              Map-reduce partition columns: col1 (type: int)
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
          TableScan
            Union
              Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
              File Output Operator
                compressed: false
                Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE Column stats: PARTIAL
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: int)
          outputColumnNames: _col0
          Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
          PTF Operator
            Function definitions:
                Input definition
                  input alias: ptf_0
                  output shape: _col0: int
                  type: WINDOWING
                Windowing table definition
                  input alias: ptf_1
                  name: windowingtablefunction
                  order by: _col0 ASC NULLS FIRST
                  partition by: _col0
                  raw input shape:
                  window functions:
                      window function definition
                        alias: Row_Number_window_0
                        name: Row_Number
                        window function: GenericUDAFRowNumberEvaluator
                        window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
                        isPivotResult: true
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
            Select Operator
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
              Lateral View Forward
                Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                Select Operator
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                  Lateral View Join Operator
                    outputColumnNames: _col1, _col2
                    Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE 
Column stats: NONE
                    Select Operator
                      Statistics: Num rows: 4 Data size: 4 Basic stats: 
COMPLETE Column stats: NONE
                      File Output Operator
                        compressed: false
                        table:
                            input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                            output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                            serde: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
                Select Operator
                  expressions: map(10:1) (type: map<int,int>)
                  outputColumnNames: _col0
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                  UDTF Operator
                    Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                    function name: explode
                    Lateral View Join Operator
                      outputColumnNames: _col1, _col2
                      Statistics: Num rows: 4 Data size: 4 Basic stats: 
COMPLETE Column stats: NONE
                      Select Operator
                        Statistics: Num rows: 4 Data size: 4 Basic stats: 
COMPLETE Column stats: NONE
                        File Output Operator
                          compressed: false
                          table:
                              input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                              output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                              serde: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

  Stage: Stage-6
    Map Reduce Local Work
      Alias -> Map Local Tables:
        _u1-subquery2:x1:t1
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        _u1-subquery2:x1:t1
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: COMPLETE
            Select Operator
              Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE 
Column stats: COMPLETE
              HashTable Sink Operator
                keys:
                  0
                  1

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: COMPLETE
            Select Operator
              expressions: 1 (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column 
stats: COMPLETE
              Union
                Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE 
Column stats: PARTIAL
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE 
Column stats: PARTIAL
                  table:
                      input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          TableScan
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              keys:
                0
                1
              Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE 
Column stats: NONE
              Select Operator
                expressions: 2 (type: int)
                outputColumnNames: _col0
                Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE 
Column stats: NONE
                Union
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE 
Column stats: PARTIAL
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 10 Data size: 88 Basic stats: 
COMPLETE Column stats: PARTIAL
                    table:
                        input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{noformat}
After:
{noformat}
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-6 depends on stages: Stage-4
  Stage-2 depends on stages: Stage-6
  Stage-0 depends on stages: Stage-2

STAGE PLANS:
  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
            Reduce Output Operator
              key expressions: col1 (type: int)
              sort order: +
              Map-reduce partition columns: col1 (type: int)
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
      Execution mode: vectorized
      Reduce Operator Tree:
        Select Operator
          expressions: KEY.reducesinkkey0 (type: int)
          outputColumnNames: _col0
          Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
          PTF Operator
            Function definitions:
                Input definition
                  input alias: ptf_0
                  output shape: _col0: int
                  type: WINDOWING
                Windowing table definition
                  input alias: ptf_1
                  name: windowingtablefunction
                  order by: _col0 ASC NULLS FIRST
                  partition by: _col0
                  raw input shape:
                  window functions:
                      window function definition
                        alias: Row_Number_window_0
                        name: Row_Number
                        window function: GenericUDAFRowNumberEvaluator
                        window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
                        isPivotResult: true
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
            Select Operator
              Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: NONE
              Lateral View Forward
                Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                Select Operator
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                  Lateral View Join Operator
                    outputColumnNames: _col1, _col2
                    Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE 
Column stats: NONE
                    Select Operator
                      Statistics: Num rows: 4 Data size: 4 Basic stats: 
COMPLETE Column stats: NONE
                      File Output Operator
                        compressed: false
                        table:
                            input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                            output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                            serde: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
                Select Operator
                  expressions: map(10:1) (type: map<int,int>)
                  outputColumnNames: _col0
                  Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                  UDTF Operator
                    Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
Column stats: NONE
                    function name: explode
                    Lateral View Join Operator
                      outputColumnNames: _col1, _col2
                      Statistics: Num rows: 4 Data size: 4 Basic stats: 
COMPLETE Column stats: NONE
                      Select Operator
                        Statistics: Num rows: 4 Data size: 4 Basic stats: 
COMPLETE Column stats: NONE
                        File Output Operator
                          compressed: false
                          table:
                              input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                              output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                              serde: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

  Stage: Stage-6
    Map Reduce Local Work
      Alias -> Map Local Tables:
        _u1-subquery2:x1:t1
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        _u1-subquery2:x1:t1
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: COMPLETE
            Select Operator
              Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE 
Column stats: COMPLETE
              HashTable Sink Operator
                keys:
                  0
                  1

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column 
stats: COMPLETE
            Select Operator
              expressions: 1 (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column 
stats: COMPLETE
              Union
                Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE 
Column stats: PARTIAL
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE 
Column stats: PARTIAL
                  table:
                      input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          TableScan
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              keys:
                0
                1
              Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE 
Column stats: NONE
              Select Operator
                expressions: 2 (type: int)
                outputColumnNames: _col0
                Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE 
Column stats: NONE
                Union
                  Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE 
Column stats: PARTIAL
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 10 Data size: 88 Basic stats: 
COMPLETE Column stats: PARTIAL
                    table:
                        input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{noformat}

 

> Query with union all, lateral view and Join fails with "cannot find parent in the child operator"
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20331
>                 URL: https://issues.apache.org/jira/browse/HIVE-20331
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 2.1.1
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>            Priority: Major
>         Attachments: HIVE-20331.1.patch
>
>
> The following query with Union, Lateral view and Join will fail during 
> execution with the exception below.
> {noformat}
> create table t1(col1 int);
> SELECT 1 AS `col1`
> FROM t1
> UNION ALL
>   SELECT 2 AS `col1`
>   FROM
>     (SELECT col1
>      FROM t1
>     ) x1
>     JOIN
>       (SELECT col1
>       FROM
>         (SELECT 
>           Row_Number() over (PARTITION BY col1 ORDER BY col1) AS `col1`
>         FROM t1
>         ) x2 lateral VIEW explode(map(10,1))`mapObj` AS `col2`, `col3`
>       ) `expdObj`      
> {noformat}
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive internal error: cannot find parent in the child operator!
>         at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:509) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {noformat}
> After debugging, it seems the issue is in the GenMRFileSink1 class, where we 
> set an incorrect aliasToWork on the MapWork.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
