Vineet Garg created HIVE-21323:
----------------------------------

             Summary: LEFT OUTER JOIN does not generate transitive IS NOT NULL filter on right side
                 Key: HIVE-21323
                 URL: https://issues.apache.org/jira/browse/HIVE-21323
             Project: Hive
          Issue Type: Improvement
            Reporter: Vineet Garg
             Fix For: 4.0.0

{code:sql}
select a.id from a left outer join c on a.id = c.id
{code}

CBO plan:
{code:sql}
HiveProject(id=[$0])
  HiveJoin(condition=[=($0, $1)], joinType=[left], algorithm=[none], cost=[{6.0 rows, 0.0 cpu, 0.0 io}])
    HiveProject(id=[$0])
      HiveTableScan(table=[[hive_21322, a]], table:alias=[a])
    HiveProject(id=[$0])
      HiveTableScan(table=[[hive_21322, c]], table:alias=[c])
{code}

Explain Plan:
{code:sql}
Stage: Stage-1
  Tez
    DagId: vgarg_20190225222008_083d8041-b5dc-4af1-9dac-4ff5305ab864:10
    Edges:
      Map 1 <- Map 2 (BROADCAST_EDGE)
    DagName: vgarg_20190225222008_083d8041-b5dc-4af1-9dac-4ff5305ab864:10
    Vertices:
      Map 1
          Map Operator Tree:
              TableScan
                alias: a
                Statistics: Num rows: 3 Data size: 255 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: id (type: string)
                  outputColumnNames: _col0
                  Statistics: Num rows: 3 Data size: 255 Basic stats: COMPLETE Column stats: COMPLETE
                  Map Join Operator
                    condition map:
                         Left Outer Join 0 to 1
                    keys:
                      0 _col0 (type: string)
                      1 _col0 (type: string)
                    outputColumnNames: _col0
                    input vertices:
                      1 Map 2
                    Statistics: Num rows: 3 Data size: 255 Basic stats: COMPLETE Column stats: COMPLETE
                    HybridGraceHashJoin: true
                    File Output Operator
                      compressed: false
                      Statistics: Num rows: 3 Data size: 255 Basic stats: COMPLETE Column stats: COMPLETE
                      table:
                          input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                          output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                          serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          Execution mode: vectorized
      Map 2
          Map Operator Tree:
              TableScan
                alias: c
                Statistics: Num rows: 3 Data size: 258 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: id (type: string)
                  outputColumnNames: _col0
                  Statistics: Num rows: 3 Data size: 258 Basic stats: COMPLETE Column stats: COMPLETE
                  Reduce Output Operator
                    key expressions: _col0 (type: string)
                    sort order: +
                    Map-reduce partition columns: _col0 (type: string)
                    Statistics: Num rows: 3 Data size: 258 Basic stats: COMPLETE Column stats: COMPLETE
          Execution mode: vectorized

Stage: Stage-0
  Fetch Operator
    limit: -1
    Processor Tree:
      ListSink
{code}

There is no IS NOT NULL filter on {{c.id}}
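
For context on why such a filter would be safe to add: under SQL's three-valued logic, {{a.id = c.id}} is never true when {{c.id}} is NULL, so rows of {{c}} with a NULL {{id}} cannot match any row of {{a}} and can be dropped before the join without changing the result. The sketch below is only an illustration of that equivalence, not Hive's planner logic. It assumes SQLite (through Python's sqlite3 module) as a stand-in engine, and the table contents and the {{run_queries}} helper are hypothetical.

```python
import sqlite3
from collections import Counter


def run_queries():
    """Run the reported query with and without an explicit IS NOT NULL
    filter on the right side, and return both results as multisets."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical sample data; both tables contain a NULL id.
    cur.execute("CREATE TABLE a (id TEXT)")
    cur.execute("CREATE TABLE c (id TEXT)")
    cur.executemany("INSERT INTO a VALUES (?)", [("1",), ("2",), (None,)])
    cur.executemany("INSERT INTO c VALUES (?)", [("1",), (None,), ("3",)])

    # The query from this issue, unchanged.
    original = cur.execute(
        "SELECT a.id FROM a LEFT OUTER JOIN c ON a.id = c.id"
    ).fetchall()

    # Same query with the right side pre-filtered on id IS NOT NULL,
    # which is the filter the issue asks the optimizer to derive.
    filtered = cur.execute(
        "SELECT a.id FROM a LEFT OUTER JOIN "
        "(SELECT id FROM c WHERE id IS NOT NULL) c ON a.id = c.id"
    ).fetchall()

    conn.close()
    # Counter compares the results as multisets, ignoring row order.
    return Counter(original), Counter(filtered)


if __name__ == "__main__":
    original, filtered = run_queries()
    print("results match:", original == filtered)
```

Note that the filter is only valid on the right (null-supplying) side. Rows of {{a}} with a NULL {{id}} must still be emitted by a LEFT OUTER JOIN, so the same filter cannot be applied to {{a}}.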