[ https://issues.apache.org/jira/browse/HIVE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lotan updated HIVE-26678: ------------------------- Fix Version/s: (was: 4.0.0) > In the filter criteria associated with multiple tables, the filter result of > the subquery by not in or in is incorrect. > ----------------------------------------------------------------------------------------------------------------------- > > Key: HIVE-26678 > URL: https://issues.apache.org/jira/browse/HIVE-26678 > Project: Hive > Issue Type: Bug > Components: CBO > Affects Versions: 3.1.0 > Reporter: lotan > Priority: Major > > create testtable as follow: > create table test101 (id string,id2 string); > create table test102 (id string,id2 string); > create table test103 (id string,id2 string); > create table test104 (id string,id2 string); > when cbo is false,run the following SQL statement: > explain select count(1) from test101 t1 > left join test102 t2 on t1.id=t2.id > left join test103 t3 on t1.id=t3.id2 > where t1.id in (select s.id from test104 s) > and t3.id2='123'; > you will see: > The filter criteria in the right table are lost. > The execution plan is as follows: > +-----------------------------------------------------------------------------------------------------+ > | Explain > | > +-----------------------------------------------------------------------------------------------------+ > | STAGE DEPENDENCIES: > | > | Stage-9 is a root stage > | > | Stage-3 depends on stages: Stage-9 > | > | Stage-0 depends on stages: Stage-3 > | > | > | > | STAGE PLANS: > | > | Stage: Stage-9 > | > | Map Reduce Local Work > | > | Alias -> Map Local Tables: > | > | sq_1:s > | > | Fetch Operator > | > | limit: -1 > | > | t2 > | > | Fetch Operator > | > | limit: -1 > | > | t3 > | > | Fetch Operator > | > | limit: -1 > | > | Alias -> Map Local Operator Tree: > | > | sq_1:s > | > | TableScan > | > | alias: s > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Filter Operator > | > | predicate: id is not null (type: boolean) > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Select Operator > | > | expressions: id (type: string) > | > | outputColumnNames: _col0 > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Group By Operator > | > | keys: _col0 (type: string) > | > | mode: hash > | > | outputColumnNames: _col0 > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | HashTable Sink Operator > | > | keys: > | > | 0 _col0 (type: string) > | > | 1 _col0 (type: string) > | > | t2 > | > | TableScan > | > | alias: t2 > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Filter Operator > | > | predicate: id is not null (type: boolean) > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | HashTable Sink Operator > | > | keys: > | > | 0 id (type: string) > | > | 1 id (type: string) > | > | 2 id2 (type: string) > | > | t3 > | > | TableScan > | > | alias: t3 > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Filter Operator > | > | predicate: id2 is not null (type: boolean) > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | HashTable Sink Operator > | > | keys: > | > | 0 id (type: string) > | > | 1 id (type: string) > | > | 2 id2 (type: string) > | > | > | > | Stage: Stage-3 > | > | Map Reduce > | > | Map Operator Tree: > | > | TableScan > | > | alias: t1 > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Filter Operator > | > | predicate: id is not null (type: boolean) > | > | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Map Join Operator > | > | condition map: > | > | Left Outer Join 0 to 1 > | > | Left Outer Join 0 to 2 > | > | keys: > | > | 0 id (type: string) > | > | 1 id (type: string) > | > | 2 id2 (type: string) > | > | outputColumnNames: _col0 > | > | Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Map Join Operator > | > | condition map: > | > | Left Semi Join 0 to 1 > | > | keys: > | > | 0 _col0 (type: string) > | > | 1 _col0 (type: string) > | > | Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL > Column stats: NONE | > | Group By Operator > | > | aggregations: count(1) > | > | mode: hash > | > | outputColumnNames: _col0 > | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > PARTIAL Column stats: NONE | > | Reduce Output Operator > | > | sort order: > | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > PARTIAL Column stats: NONE | > | value expressions: _col0 (type: bigint) > | > | Local Work: > | > +-----------------------------------------------------------------------------------------------------+ > | Explain > | > +-----------------------------------------------------------------------------------------------------+ > | Map Reduce Local Work > | > | Reduce Operator Tree: > | > | Group By Operator > | > | aggregations: count(VALUE._col0) > | > | mode: mergepartial > | > | outputColumnNames: _col0 > | > | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column > stats: NONE | > | File Output Operator > | > | compressed: false > | > | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL > Column stats: NONE | > | table: > | > | input format: > org.apache.hadoop.mapred.SequenceFileInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat | > | serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > | > | > | > | Stage: Stage-0 > | > | Fetch Operator > | > | limit: -1 > | > | Processor Tree: > | > | ListSink > | > | > | > +-----------------------------------------------------------------------------------------------------+ -- This message was sent by Atlassian Jira (v8.20.10#820010)