Anupam Yadav created SPARK-57353:
------------------------------------

             Summary: [Analyzer++] GROUPING SETS/CUBE/ROLLUP with HAVING or 
ORDER BY crashes with SparkUnsupportedOperationException
                 Key: SPARK-57353
                 URL: https://issues.apache.org/jira/browse/SPARK-57353
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Anupam Yadav


With `spark.sql.analyzer.singlePassResolver.enabled=true`, queries that combine 
GROUP BY CUBE/ROLLUP/GROUPING SETS with a HAVING or ORDER BY clause containing 
an aggregate function fail during analysis with:

{noformat}
org.apache.spark.SparkUnsupportedOperationException: 
[UNSUPPORTED_CALL.WITHOUT_SUGGESTION]
Cannot call the method "dataType$" of the class 
"org.apache.spark.sql.catalyst.expressions.BaseGroupingSets".
SQLSTATE: 0A000
{noformat}

The single-pass resolver path invokes `assertValidAggregation`, which calls 
`checkValidGroupingExprs` on sort/filter expressions. That function accesses 
`.dataType` on `BaseGroupingSets` expressions (Cube/Rollup/GroupingSets). These 
expressions are placeholders whose `dataType` method always throws, because they 
are meant to be expanded before type resolution.
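
The failure mode can be illustrated with a small Python model. This is only a 
sketch and not Spark's code: `Column`, `GroupingPlaceholder`, `ORDERABLE`, 
`UnsupportedCall` and the body of `check_valid_grouping_exprs` are all 
hypothetical stand-ins, and the assumption that the check inspects the type for 
orderability is a simplification made for the example.

```python
class UnsupportedCall(Exception):
    """Stand-in for SparkUnsupportedOperationException."""


class Column:
    """Stand-in for an ordinary, already-typed grouping expression."""

    def __init__(self, name, data_type):
        self.name = name
        self.data_type = data_type


class GroupingPlaceholder:
    """Stand-in for Cube/Rollup/GroupingSets before expansion."""

    def __init__(self, kind, children):
        self.kind = kind
        self.children = children

    @property
    def data_type(self):
        # The placeholder has no type of its own until it is expanded.
        raise UnsupportedCall(f'Cannot call the method "dataType" of {self.kind}')


ORDERABLE = {"int", "string"}  # hypothetical set of types accepted as grouping keys


def check_valid_grouping_exprs(exprs):
    """Hypothetical check: every grouping expression must have an orderable type."""
    for expr in exprs:
        if expr.data_type not in ORDERABLE:
            raise ValueError(f"unsupported grouping type: {expr.data_type}")


a, b = Column("a", "int"), Column("b", "int")
check_valid_grouping_exprs([a, b])  # plain columns pass the check

try:
    # Passing the unexpanded placeholder reaches data_type and throws.
    check_valid_grouping_exprs([GroupingPlaceholder("Cube", [a, b])])
except UnsupportedCall as err:
    print("check failed:", err)
```

In this model the check itself is sound. It fails only because it is handed a 
placeholder that has not yet been replaced by its typed children.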

The legacy analyzer (the default) handles all of these queries correctly.

*Repro:*

{code:sql}
-- All three variants crash with singlePassResolver enabled:

-- Variant 1: CUBE + ORDER BY
SELECT a, b, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
GROUP BY CUBE(a, b) ORDER BY SUM(b);

-- Variant 2: ROLLUP + HAVING
SELECT a, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
GROUP BY ROLLUP(a, b) HAVING SUM(b) > 25;

-- Variant 3: GROUPING SETS + ORDER BY
SELECT a, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
GROUP BY GROUPING SETS ((a, b), (a), ()) ORDER BY SUM(b);
{code}

*Root cause:* `ExprUtils.checkValidGroupingExprs` (ExprUtils.scala:211) calls 
`.dataType` on `BaseGroupingSets` expressions before they have been expanded in 
the single-pass resolver path.
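
For readers unfamiliar with what "expanded" means here: under standard SQL 
semantics, ROLLUP and CUBE are shorthand for a list of grouping sets over 
ordinary columns. The Python sketch below enumerates those sets. The function 
names are hypothetical, and the order of the sets is not meant to match Spark's 
internal ordering.

```python
from itertools import combinations


def rollup_sets(cols):
    """ROLLUP(c1, ..., cn): every prefix of the column list, longest first."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """CUBE(c1, ..., cn): every subset of the column list, largest first."""
    return [s for r in range(len(cols), -1, -1) for s in combinations(cols, r)]


print(rollup_sets(["a", "b"]))
print(cube_sets(["a", "b"]))
```

For two columns, the ROLLUP result is the same list of sets that Variant 3 
spells out explicitly: (a, b), (a), (). Once a query is in this form, every 
grouping key is a plain column with a well-defined type.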


