[ https://issues.apache.org/jira/browse/HIVE-26653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis updated HIVE-26653: --------------------------------------- Attachment: hive_26653_explain_cbo.txt hive_26653_explain.txt > Wrong results when (map) joining multiple tables on partition column > -------------------------------------------------------------------- > > Key: HIVE-26653 > URL: https://issues.apache.org/jira/browse/HIVE-26653 > Project: Hive > Issue Type: Bug > Components: HiveServer2 > Reporter: Stamatis Zampetakis > Assignee: Stamatis Zampetakis > Priority: Major > Attachments: hive_26653.q, hive_26653_explain.txt, > hive_26653_explain_cbo.txt, table_a.csv, table_b.csv > > > The result of the query must have exactly one row matching the date specified > in the WHERE clause but the query returns nothing. > {code:sql} > CREATE TABLE table_a (`aid` string ) PARTITIONED BY (`p_dt` string) > row format delimited fields terminated by ',' stored as textfile; > LOAD DATA LOCAL INPATH '../../data/files/_tbla.csv' into TABLE table_a; > CREATE TABLE table_b (`bid` string) PARTITIONED BY (`p_dt` string) > row format delimited fields terminated by ',' stored as textfile; > LOAD DATA LOCAL INPATH '../../data/files/_tblb.csv' into TABLE table_b; > set hive.auto.convert.join=true; > set hive.optimize.semijoin.conversion=false; > SELECT a.p_dt > FROM ((SELECT p_dt > FROM table_b > GROUP BY p_dt) a > JOIN > (SELECT p_dt > FROM table_a > GROUP BY p_dt) b ON a.p_dt = b.p_dt > JOIN > (SELECT p_dt > FROM table_a > GROUP BY p_dt) c ON a.p_dt = c.p_dt) > WHERE a.p_dt = translate(cast(to_date(date_sub('2022-08-01', 1)) AS string), > '-', ''); > {code} > +Expected result+ > 20220731 > +Actual result+ > Empty > To reproduce the problem the tables need to have some data. Values in aid and > bid columns are not important. For p_dt column use one of the following > values 20220731, 20220630. > I will attach some sample data with which the problem can be reproduced. The > tables look like below. > ||aid|pdt|| > |611|20220731| > |239|20220630| > |...|...| > The problem can be reproduced via qtest in current master > (commit > [6b05d64ce8c7161415d97a7896ea50025322e30a|https://github.com/apache/hive/commit/6b05d64ce8c7161415d97a7896ea50025322e30a]) > by running the TestMiniLlapLocalCliDriver. > There is specific query plan (will attach shortly) for which the problem > shows up so if the plan changes slightly the problem may not appear anymore; > this is why we need to set explicitly hive.optimize.semijoin.conversion and > hive.auto.convert.join to trigger the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)