[ https://issues.apache.org/jira/browse/HIVE-28254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhihua Deng updated HIVE-28254:
-------------------------------
    Labels: hive-4.0.1-merged hive-4.0.1-must pull-request-available  (was: hive-4.0.1-must pull-request-available)

> CBO (Calcite Return Path): Multiple DISTINCT leads to wrong results
> -------------------------------------------------------------------
>
>                 Key: HIVE-28254
>                 URL: https://issues.apache.org/jira/browse/HIVE-28254
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>    Affects Versions: 4.0.0
>            Reporter: Shohei Okumiya
>            Assignee: Shohei Okumiya
>            Priority: Major
>              Labels: hive-4.0.1-merged, hive-4.0.1-must, pull-request-available
>             Fix For: 4.1.0
>
>
> The CBO return path can build an incorrect GroupByOperator when multiple
> aggregations with DISTINCT are involved.
> For example:
> {code:java}
> CREATE TABLE test (col1 INT, col2 INT);
> INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300);
> set hive.cbo.returnpath.hiveop=true;
> set hive.map.aggr=false;
> SELECT
>   SUM(DISTINCT col1),
>   COUNT(DISTINCT col1),
>   SUM(DISTINCT col2),
>   SUM(col2)
> FROM test;{code}
> The last column should be 800, but the SUM refers to col1 instead of col2,
> so the actual result is 8.
> {code:java}
> +------+------+------+------+
> | _c0  | _c1  | _c2  | _c3  |
> +------+------+------+------+
> | 6    | 3    | 600  | 8    |
> +------+------+------+------+ {code}
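
The expected values in the report can be cross-checked outside Hive. The following is a minimal sketch that runs the same query through Python's built-in sqlite3 module. SQLite is used here only as an assumed stand-in reference engine; it has no CBO return path and says nothing about how Hive builds its GroupByOperator. The `miswired` query is a hypothetical illustration of the last aggregate reading col1 instead of col2, which matches the 8 shown in the report.

```python
import sqlite3

# SQLite is only a stand-in reference engine (an assumption for illustration);
# it is not Hive and does not reproduce the CBO return path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 INT, col2 INT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, 100), (2, 200), (2, 200), (3, 300)],
)

# The query from the report, without the Hive-specific `set` statements.
expected = conn.execute(
    """
    SELECT
      SUM(DISTINCT col1),
      COUNT(DISTINCT col1),
      SUM(DISTINCT col2),
      SUM(col2)
    FROM test
    """
).fetchone()

# Hypothetical illustration: the value of the last column if the plain SUM
# were applied to col1 instead of col2.
miswired = conn.execute("SELECT SUM(col1) FROM test").fetchone()[0]

print(expected)  # (6, 3, 600, 800)
print(miswired)  # 8
```

The first three columns agree with the Hive output in the report; only the last one differs (800 expected, 8 observed), which is consistent with the plain SUM being bound to col1.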