[
https://issues.apache.org/jira/browse/CALCITE-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012931#comment-18012931
]
Julian Hyde commented on CALCITE-7126:
--------------------------------------
[~jensen], I read through your PR but it's not easy to tell what the problem is.
My hypothesis is that when an {{Aggregate}} contains both a {{GROUP_ID}} and a
{{GROUPING}} function call, the {{rewriteAggregateWithDuplicateGroupSets}}
rewrite is invoked. The rewrite deals with the {{GROUP_ID}} calls correctly but
fails to handle the {{GROUPING}} functions correctly. The {{GROUPING}}
functions are incorrect because the resulting aggregates (children of the new
{{Union}} operator) have fewer columns in their group key.
If this hypothesis is correct, there is probably a simpler query that
reproduces the problem. I think it would be useful to work through that query.
I don't think the {{allowChangeNullable}} attribute solves the problem. It
certainly doesn't seem central to the problem. I recall that the existing
{{groupCount}} attribute is used to answer the question "can this aggregate
function be ever applied to an empty group?" and I think you should use it,
rather than introduce {{allowChangeNullable}}.
> The calculation result of grouping function is wrong
> ----------------------------------------------------
>
> Key: CALCITE-7126
> URL: https://issues.apache.org/jira/browse/CALCITE-7126
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.40.0
> Reporter: Zhen Chen
> Priority: Major
> Labels: pull-request-available
>
> The calculation result of grouping function is wrong.
> in agg.iq result is
> {code:java}
> +--------+-----+-------+--------+-------------+----------+----------+
> | DEPTNO | JOB | SAL | EXPR$3 | DEPTNO_FLAG | JOB_FLAG | SAL_FLAG |
> +--------+-----+-------+--------+-------------+----------+----------+
> | 10 | aa | | 1250 | 0 | 0 | 1 |
> | 10 | aa | | 1250 | 0 | 0 | 0 |
> | 10 | | 7000 | | 0 | 1 | 0 |
> | 10 | | 10000 | 1000 | 0 | 1 | 0 |
> | 10 | | 11500 | 250 | 0 | 1 | 0 |
> | 20 | aa | | 500 | 0 | 0 | 1 |
> | 20 | aa | | 500 | 0 | 0 | 0 |
> | 20 | | 8000 | 500 | 0 | 1 | 0 |
> +--------+-----+-------+--------+-------------+----------+----------+
> (8 rows)
> {code}
> in pgsql result is
> {code:java}
> deptno | job | sal | sum | deptno_flag | job_flag | sal_flag
> --------+-----+-------+------+-------------+----------+----------
> 10 | aa | | 1250 | 0 | 0 | 1
> 10 | aa | | 1250 | 0 | 0 | 1
> 10 | | 7000 | | 0 | 1 | 0
> 10 | | 10000 | 1000 | 0 | 1 | 0
> 10 | | 11500 | 250 | 0 | 1 | 0
> 20 | aa | | 500 | 0 | 0 | 1
> 20 | aa | | 500 | 0 | 0 | 1
> 20 | | 8000 | 500 | 0 | 1 | 0
> (8 rows)
> {code}
> The sal_flag column has incorrect values in rows 2 and 7.
> SQL is
> {code:java}
> WITH emps_data AS (
> SELECT * FROM (VALUES
> (100, 10, 'Bill', 10000, 1000, 'aa'),
> (110, 10, 'Theodore', 11500, 250, 'aa'),
> (150, 10, 'Sebastian', 7000, NULL, 'aa'),
> (200, 20, 'Eric', 8000, 500, 'aa')
> ) AS t(empno, deptno, ename, sal, comm, job)
> )
> SELECT deptno, job, sal, SUM(comm),
> GROUPING(deptno) AS deptno_flag,
> GROUPING(job) AS job_flag,
> GROUPING(sal) AS sal_flag
> FROM emps_data
> GROUP BY GROUPING SETS ((deptno, job), (deptno, sal), (deptno, job))
> ORDER BY deptno, job, sal;
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)