[jira] [Commented] (CALCITE-7126) The calculation result of grouping function is wrong

Julian Hyde (Jira) Fri, 08 Aug 2025 15:36:04 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012931#comment-18012931
 ]


Julian Hyde commented on CALCITE-7126:
--------------------------------------

[~jensen], I read through your PR but it's not easy to tell what the problem is.

My hypothesis is that when an {{Aggregate}} contains both a {{GROUP_ID}} and a 
{{GROUPING}} function call, the {{rewriteAggregateWithDuplicateGroupSets}} 
rewrite is invoked. The rewrite deals with the {{GROUP_ID}} calls correctly but 
fails to handle the {{GROUPING}} functions correctly. The {{GROUPING}} 
functions are incorrect because the resulting aggregates (children of the new 
{{Union}} operator) have fewer columns in their group key.

If this hypothesis is correct, there is probably a simpler query that 
reproduces the problem. I think it would be useful to work through that query.

I don't think the {{allowChangeNullable}} attribute solves the problem. It 
certainly doesn't seem central to the problem. I recall that the existing 
{{groupCount}} attribute is used to answer the question "can this aggregate 
function be ever applied to an empty group?" and I think you should use it, 
rather than introduce {{allowChangeNullable}}.

> The calculation result of grouping function is wrong
> ----------------------------------------------------
>
>                 Key: CALCITE-7126
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7126
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.40.0
>            Reporter: Zhen Chen
>            Priority: Major
>              Labels: pull-request-available
>
> The calculation result of grouping function is wrong.
> in agg.iq result is
> {code:java}
> +--------+-----+-------+--------+-------------+----------+----------+
> | DEPTNO | JOB | SAL   | EXPR$3 | DEPTNO_FLAG | JOB_FLAG | SAL_FLAG |
> +--------+-----+-------+--------+-------------+----------+----------+
> |     10 | aa  |       |   1250 |           0 |        0 |        1 |
> |     10 | aa  |       |   1250 |           0 |        0 |        0 |
> |     10 |     |  7000 |        |           0 |        1 |        0 |
> |     10 |     | 10000 |   1000 |           0 |        1 |        0 |
> |     10 |     | 11500 |    250 |           0 |        1 |        0 |
> |     20 | aa  |       |    500 |           0 |        0 |        1 |
> |     20 | aa  |       |    500 |           0 |        0 |        0 |
> |     20 |     |  8000 |    500 |           0 |        1 |        0 |
> +--------+-----+-------+--------+-------------+----------+----------+
> (8 rows)
> {code}
> in pgsql result is
> {code:java}
>  deptno | job |  sal  | sum  | deptno_flag | job_flag | sal_flag 
> --------+-----+-------+------+-------------+----------+----------
>      10 | aa  |       | 1250 |           0 |        0 |        1
>      10 | aa  |       | 1250 |           0 |        0 |        1
>      10 |     |  7000 |      |           0 |        1 |        0
>      10 |     | 10000 | 1000 |           0 |        1 |        0
>      10 |     | 11500 |  250 |           0 |        1 |        0
>      20 | aa  |       |  500 |           0 |        0 |        1
>      20 | aa  |       |  500 |           0 |        0 |        1
>      20 |     |  8000 |  500 |           0 |        1 |        0
> (8 rows)
> {code}
> The sal_flag column has incorrect values in rows 2 and 7.
> SQL is 
> {code:java}
> WITH emps_data AS (
>   SELECT * FROM (VALUES
>     (100, 10, 'Bill', 10000, 1000, 'aa'),
>     (110, 10, 'Theodore', 11500, 250, 'aa'),
>     (150, 10, 'Sebastian', 7000, NULL, 'aa'),
>     (200, 20, 'Eric', 8000, 500, 'aa')
>   ) AS t(empno, deptno, ename, sal, comm, job)
> )
> SELECT deptno, job, sal, SUM(comm),
>        GROUPING(deptno) AS deptno_flag,
>        GROUPING(job) AS job_flag,
>        GROUPING(sal) AS sal_flag
> FROM emps_data
> GROUP BY GROUPING SETS ((deptno, job), (deptno, sal), (deptno, job))
> ORDER BY deptno, job, sal;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (CALCITE-7126) The calculation result of grouping function is wrong

Reply via email to