[ https://issues.apache.org/jira/browse/HIVE-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aihua Xu updated HIVE-12762: ---------------------------- Affects Version/s: 2.0.0 > Common join on parquet tables returns incorrect result when > hive.optimize.index.filter set to true > -------------------------------------------------------------------------------------------------- > > Key: HIVE-12762 > URL: https://issues.apache.org/jira/browse/HIVE-12762 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer > Affects Versions: 2.0.0, 2.1.0 > Reporter: Aihua Xu > Assignee: Aihua Xu > Fix For: 2.0.0, 2.1.0 > > Attachments: HIVE-12762.2.patch, HIVE-12762.patch > > > The following query will give incorrect result. > {noformat} > CREATE TABLE tbl1(id INT) STORED AS PARQUET; > INSERT INTO tbl1 VALUES(1), (2); > CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET; > INSERT INTO tbl2 VALUES(1, 'value1'); > INSERT INTO tbl2 VALUES(1, 'value2'); > set hive.optimize.index.filter = true; > set hive.auto.convert.join=false; > select tbl1.id, t1.value, t2.value > FROM tbl1 > JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id > JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id; > {noformat} > We are enforcing to use common join and tbl2 will have 2 files after 2 > insertions underneath. > the map job contains 3 TableScan operators (2 for tbl2 and 1 for tbl1). When > hive.optimize.index.filter is set to true, we are incorrectly applying the > later filtering condition to each block, which causes no data is returned for > the subquery {{SELECT * FROM tbl2 WHERE value='value1'}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)