zhong.zhu created KYLIN-5742:
--------------------------------

             Summary: When the GROUP BY list contains duplicate columns, the result of a GROUPING SETS query is inconsistent with SparkSQL
                 Key: KYLIN-5742
                 URL: https://issues.apache.org/jira/browse/KYLIN-5742
             Project: Kylin
          Issue Type: Bug
    Affects Versions: 5.0-beta
            Reporter: zhong.zhu
             Fix For: 5.0.0
         Attachments: image-2023-12-11-14-54-38-652.png, image-2023-12-11-14-55-46-222.png, image-2023-12-11-14-57-32-037.png, image-2023-12-11-14-57-56-771.png
{code:sql}
-- sql1
select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
FROM SSB.LINEORDER as LINEORDER
INNER JOIN SSB.CUSTOMER as CUSTOMER
    ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
group by GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
order by C_NAME;

-- sql2
select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
FROM SSB.LINEORDER as LINEORDER
INNER JOIN SSB.CUSTOMER as CUSTOMER
    ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
group by C_NAME,C_CITY,C_NATION,C_REGION,
    GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
order by C_NAME;

-- sql3
select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
FROM SSB.LINEORDER as LINEORDER
INNER JOIN SSB.CUSTOMER as CUSTOMER
    ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
group by C_NAME,C_CITY,C_NATION,C_REGION
    GROUPING SETS ((),(C_NAME,C_CITY),(C_NATION,C_REGION))
order by C_NAME
{code}

In spark-sql, the query results of sql1 and sql3 are consistent with each other, as follows:

!image-2023-12-11-14-54-38-652.png!

In spark-sql, the query result of sql2 is as follows:

!image-2023-12-11-14-55-46-222.png!

In Kylin, the query result of sql1 is as follows; it is consistent with the spark-sql sql1 result:

!image-2023-12-11-14-57-32-037.png!

In Kylin, the query result of sql2 is as follows; it is inconsistent with the spark-sql sql2 result:

!image-2023-12-11-14-57-56-771.png!

The syntax of sql3 is not supported in Kylin.

Hive does not support a comma before GROUPING SETS, i.e. sql2 is not supported in Hive, and Hive's query results for sql1 and sql3 are consistent with spark-sql.
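For reference, a possible explanation of the sql2 difference: under the standard GROUPING SETS expansion, which the Spark SQL GROUP BY documentation also describes, grouping expressions listed before GROUPING SETS are added to every grouping set. Under that reading, sql2 should behave roughly like the rewrite sketched below, in which every grouping set ends up containing all four columns. This is only an illustrative sketch of the expected Spark behavior, not a description of Kylin's current implementation.

{code:sql}
-- Sketch only: sql2 rewritten under the standard expansion, assuming
--   GROUP BY a, b, GROUPING SETS ((s1), (s2))
--   is equivalent to GROUP BY GROUPING SETS ((a, b, s1), (a, b, s2)),
-- so each of sql2's three grouping sets contains all four columns.
select C_NAME,C_CITY,C_NATION,C_REGION,count(*)
FROM SSB.LINEORDER as LINEORDER
INNER JOIN SSB.CUSTOMER as CUSTOMER
    ON LINEORDER.LO_CUSTKEY = CUSTOMER.C_CUSTKEY
where C_NATION = 'CHINA' and C_CITY = 'CHINA 0'
group by GROUPING SETS (
    (C_NAME,C_CITY,C_NATION,C_REGION),
    (C_NAME,C_CITY,C_NATION,C_REGION),
    (C_NAME,C_CITY,C_NATION,C_REGION))
order by C_NAME;
{code}

If that expansion is what applies, every sql2 row is grouped on all four columns, which would explain why spark-sql's sql2 output differs from its sql1 output.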