Anupam Yadav created SPARK-57353:
------------------------------------
Summary: [Analyzer++] GROUPING SETS/CUBE/ROLLUP with HAVING or
ORDER BY crashes with SparkUnsupportedOperationException
Key: SPARK-57353
URL: https://issues.apache.org/jira/browse/SPARK-57353
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.0.0
Reporter: Anupam Yadav
With `spark.sql.analyzer.singlePassResolver.enabled=true`, queries that combine
GROUP BY CUBE/ROLLUP/GROUPING SETS with a HAVING or ORDER BY clause containing
an aggregate function fail with:
{noformat}
org.apache.spark.SparkUnsupportedOperationException:
[UNSUPPORTED_CALL.WITHOUT_SUGGESTION]
Cannot call the method "dataType$" of the class
"org.apache.spark.sql.catalyst.expressions.BaseGroupingSets".
SQLSTATE: 0A000
{noformat}
The single-pass resolver path invokes `assertValidAggregation`, which calls
`checkValidGroupingExprs` on the sort/filter expressions. That function accesses
`.dataType` on `BaseGroupingSets` expressions (Cube/Rollup/GroupingSets), and
these expressions throw from their `dataType` method because they are meant to
be expanded before type resolution.
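The failure mode can be illustrated with a toy model outside Spark. The sketch below is plain Python, not Spark code: every name in it (`Attribute`, `Cube`, `UnsupportedCall`, `check_valid_grouping_exprs`, and the made-up "map" type rule) is a hypothetical stand-in. It only mirrors the shape described above: a placeholder expression whose type accessor throws until it is expanded, and a validation step that reads the type of every grouping expression.

```python
class UnsupportedCall(Exception):
    """Hypothetical stand-in for SparkUnsupportedOperationException."""


class Attribute:
    """An ordinary, already-resolved column reference."""

    def __init__(self, name, data_type):
        self.name = name
        self._data_type = data_type

    @property
    def data_type(self):
        return self._data_type


class Cube:
    """Placeholder for CUBE(...): it has no type of its own until expanded."""

    def __init__(self, *children):
        self.children = children

    @property
    def data_type(self):
        raise UnsupportedCall('Cannot call the method "data_type" of the class "Cube"')


def check_valid_grouping_exprs(exprs):
    """Toy validation that reads the type of every grouping expression."""
    for expr in exprs:
        # Made-up rule for illustration; the real check is not reproduced here.
        if expr.data_type == "map":
            raise ValueError("cannot group by a map-typed expression")


a = Attribute("a", "int")
b = Attribute("b", "int")

# Passes once the placeholder has been replaced by its plain columns.
check_valid_grouping_exprs([a, b])

# Throws if the check runs while the placeholder is still in the tree.
try:
    check_valid_grouping_exprs([Cube(a, b)])
    outcome = "ok"
except UnsupportedCall:
    outcome = "unsupported"
```

In this model the check itself is unchanged between the two calls; only the moment at which it runs relative to expansion differs.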
The legacy analyzer (the default) handles all of these queries correctly.
*Repro:*
{code:sql}
-- All three variants crash with singlePassResolver enabled:
-- Variant 1: CUBE + ORDER BY
SELECT a, b, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
GROUP BY CUBE(a, b) ORDER BY SUM(b);
-- Variant 2: ROLLUP + HAVING
SELECT a, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
GROUP BY ROLLUP(a, b) HAVING SUM(b) > 25;
-- Variant 3: GROUPING SETS + ORDER BY
SELECT a, SUM(b) FROM VALUES (1,10),(1,20),(2,30) AS t(a,b)
GROUP BY GROUPING SETS ((a, b), (a), ()) ORDER BY SUM(b);
{code}
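For reference when checking a fix, the expected rows can be worked out by hand. The sketch below is plain Python rather than Spark code, and its helper names (`rollup`, `cube`, `sum_b_per_group`) are hypothetical. It relies only on the standard equivalences `ROLLUP(a, b)` = `GROUPING SETS ((a, b), (a), ())` and `CUBE(a, b)` = `GROUPING SETS ((a, b), (a), (b), ())`, and evaluates Variant 2 on the three input rows.

```python
from itertools import combinations

ROWS = [{"a": 1, "b": 10}, {"a": 1, "b": 20}, {"a": 2, "b": 30}]


def rollup(cols):
    """ROLLUP(c1..cn): every prefix of the column list, longest first."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube(cols):
    """CUBE(c1..cn): every subset of the column list."""
    return [s for n in range(len(cols), -1, -1) for s in combinations(cols, n)]


def sum_b_per_group(rows, grouping_sets, cols):
    """Compute SUM(b) once per grouping set; columns outside the set become None."""
    out = []
    for gs in grouping_sets:
        totals = {}
        for row in rows:
            key = tuple(row[c] if c in gs else None for c in cols)
            totals[key] = totals.get(key, 0) + row["b"]
        out.extend(totals.items())
    return out


cols = ["a", "b"]
grouped = sum_b_per_group(ROWS, rollup(cols), cols)

# Variant 2: SELECT a, SUM(b) ... GROUP BY ROLLUP(a, b) HAVING SUM(b) > 25
# Sorted here only to make the printout deterministic (None sorts last).
variant2 = sorted(
    ((key[0], total) for key, total in grouped if total > 25),
    key=lambda r: (r[0] is None, r[0] or 0, r[1]),
)
print(variant2)  # [(1, 30), (2, 30), (2, 30), (None, 60)]
```

Variant 2 has no ORDER BY, so the order in which Spark returns these four rows is unspecified; only the set of rows is meaningful for comparison.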
*Root cause:* `ExprUtils.checkValidGroupingExprs` (ExprUtils.scala:211) calls
`.dataType` on `BaseGroupingSets` expressions before they have been expanded in
the single-pass resolver path.