Jesus Camacho Rodriguez created HIVE-13816:
----------------------------------------------

             Summary: Infer constants directly when we create semijoin
                 Key: HIVE-13816
                 URL: https://issues.apache.org/jira/browse/HIVE-13816
             Project: Hive
          Issue Type: Sub-task
          Components: Parser
    Affects Versions: 2.1.0
            Reporter: Jesus Camacho Rodriguez
            Assignee: Jesus Camacho Rodriguez


Follow-up on HIVE-13068.

When we create a left semijoin, we could infer the constants from the SEL below when we create the GB to remove duplicates on the right hand side.

Ex. ql/src/test/results/clientpositive/constprog_semijoin.q.out

{noformat}
explain select table1.id, table1.val, table1.val1
from table1 left semi join table3 on table1.dimid = table3.id and table3.id = 100
where table1.dimid = 100;
{noformat}

Plan:
{noformat}
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: table1
            Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (((dimid = 100) = true) and (dimid = 100)) (type: boolean)
              Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: id (type: int), val (type: string), val1 (type: string)
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: 100 (type: int), true (type: boolean)
                  sort order: ++
                  Map-reduce partition columns: 100 (type: int), true (type: boolean)
                  Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string)
          TableScan
            alias: table3
            Statistics: Num rows: 5 Data size: 15 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (((id = 100) = true) and (id = 100)) (type: boolean)
              Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: 100 (type: int), true (type: boolean)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                Group By Operator
                  keys: _col0 (type: int), _col1 (type: boolean)
                  mode: hash
                  outputColumnNames: _col0, _col1
                  Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: _col0 (type: int), _col1 (type: boolean)
                    sort order: ++
                    Map-reduce partition columns: _col0 (type: int), _col1 (type: boolean)
                    Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
      Reduce Operator Tree:
        Join Operator
          condition map:
               Left Semi Join 0 to 1
          keys:
            0 100 (type: int), true (type: boolean)
            1 _col0 (type: int), _col1 (type: boolean)
          outputColumnNames: _col0, _col1, _col2
          Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)