[jira] [Created] (HIVE-13816) Infer constants directly when we create semijoin

Jesus Camacho Rodriguez (JIRA) Sat, 21 May 2016 13:29:24 -0700

Jesus Camacho Rodriguez created HIVE-13816:
----------------------------------------------


             Summary: Infer constants directly when we create semijoin
                 Key: HIVE-13816
                 URL: https://issues.apache.org/jira/browse/HIVE-13816
             Project: Hive
          Issue Type: Sub-task
          Components: Parser
    Affects Versions: 2.1.0
            Reporter: Jesus Camacho Rodriguez
            Assignee: Jesus Camacho Rodriguez


Follow-up on HIVE-13068.

When we create a left semijoin, we add a GB (Group By) on the right-hand side 
to remove duplicates. At that point, we could infer the constants directly 
from the SEL (Select) below it.

Ex. ql/src/test/results/clientpositive/constprog_semijoin.q.out

{noformat}
explain select table1.id, table1.val, table1.val1
from table1 left semi join table3 on table1.dimid = table3.id and table3.id = 100
where table1.dimid = 100;
{noformat}

Plan:
{noformat}
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: table1
            Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (((dimid = 100) = true) and (dimid = 100)) (type: boolean)
              Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: id (type: int), val (type: string), val1 (type: string)
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: 100 (type: int), true (type: boolean)
                  sort order: ++
                  Map-reduce partition columns: 100 (type: int), true (type: boolean)
                  Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string)
          TableScan
            alias: table3
            Statistics: Num rows: 5 Data size: 15 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (((id = 100) = true) and (id = 100)) (type: boolean)
              Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: 100 (type: int), true (type: boolean)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                Group By Operator
                  keys: _col0 (type: int), _col1 (type: boolean)
                  mode: hash
                  outputColumnNames: _col0, _col1
                  Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: _col0 (type: int), _col1 (type: boolean)
                    sort order: ++
                    Map-reduce partition columns: _col0 (type: int), _col1 (type: boolean)
                    Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
      Reduce Operator Tree:
        Join Operator
          condition map:
               Left Semi Join 0 to 1
          keys:
            0 100 (type: int), true (type: boolean)
            1 _col0 (type: int), _col1 (type: boolean)
          outputColumnNames: _col0, _col1, _col2
          Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{noformat}
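
In the plan above, the Select Operator on the table3 side already emits the 
constants 100 and true, yet the Group By Operator still keys on _col0 and 
_col1. The proposed inference could be sketched roughly as follows. This is a 
toy model in Python, not Hive's actual code: Const, ColumnRef and 
infer_constant_keys are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Const:
    """A literal with its type name, e.g. 100 (type: int). Hypothetical class."""
    value: object
    type_name: str


@dataclass(frozen=True)
class ColumnRef:
    """A reference to an output column of the operator below, e.g. _col0."""
    name: str


Expr = Union[Const, ColumnRef]


def infer_constant_keys(select_outputs: Dict[str, Expr],
                        group_by_keys: List[ColumnRef]) -> List[Expr]:
    """Replace each Group By key whose source in the Select below is a constant.

    Keys that map to anything other than a constant are left unchanged.
    """
    inferred: List[Expr] = []
    for key in group_by_keys:
        source = select_outputs.get(key.name)
        inferred.append(source if isinstance(source, Const) else key)
    return inferred


# Outputs of the Select Operator on the table3 side of the plan above.
select_outputs = {
    "_col0": Const(100, "int"),
    "_col1": Const(True, "boolean"),
}
# Group By keys as they currently appear in the plan above.
group_by_keys = [ColumnRef("_col0"), ColumnRef("_col1")]

keys = infer_constant_keys(select_outputs, group_by_keys)
```

Under these assumptions, both Group By keys resolve to constants at the time 
the GB is created, rather than being left as column references.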



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
