mihailoale-db opened a new pull request, #50791: URL: https://github.com/apache/spark/pull/50791
### What changes were proposed in this pull request?

In this PR I propose that we change `.toString` to `toPrettySQL` when constructing grouping expressions in the `ResolveGroupingAnalytics` rule.

### Why are the changes needed?

Currently, the following query passes (`#x` and `#y` are expression IDs, which are generated on every cluster start):

`` select * from values(1,2) group by grouping sets (col1,col2,col1+col2) order by `(col1#x + col2#y)` ``

After the next cluster restart, however, the expression IDs are regenerated and the same query fails. We need to fix this to disallow the nondeterministic behavior.

### Does this PR introduce _any_ user-facing change?

Some queries (and DataFrame programs) are going to fail, but they would already fail after any cluster restart (as explained above).

### How was this patch tested?

Added tests.

### Was this patch authored or co-authored using generative AI tooling?

No.
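To illustrate why a name built from `.toString` is unstable, here is a toy Python model. It is only a sketch and not Spark's implementation: the classes `Attr` and `Add`, the helper `grouping_column_names`, and the ID values are all hypothetical stand-ins for Catalyst expressions and the expression ID counter.

```python
import itertools


class Attr:
    """Hypothetical stand-in for a resolved column reference."""

    def __init__(self, name, expr_id):
        self.name = name
        self.expr_id = expr_id

    def to_string(self):
        # Mimics a debug-style name that embeds the expression ID.
        return f"{self.name}#{self.expr_id}"

    def to_pretty_sql(self):
        # Mimics a user-facing name without the expression ID.
        return self.name


class Add:
    """Hypothetical stand-in for the expression col1 + col2."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def to_string(self):
        return f"({self.left.to_string()} + {self.right.to_string()})"

    def to_pretty_sql(self):
        return f"({self.left.to_pretty_sql()} + {self.right.to_pretty_sql()})"


def grouping_column_names(first_id):
    """Build col1 + col2 with IDs starting at first_id (assumed to
    differ between cluster starts) and return both candidate names."""
    ids = itertools.count(first_id)
    col1 = Attr("col1", next(ids))
    col2 = Attr("col2", next(ids))
    total = Add(col1, col2)
    return total.to_string(), total.to_pretty_sql()


before_restart = grouping_column_names(first_id=10)
after_restart = grouping_column_names(first_id=42)

print(before_restart)  # ('(col1#10 + col2#11)', '(col1 + col2)')
print(after_restart)   # ('(col1#42 + col2#43)', '(col1 + col2)')
```

In this model, an `order by` that refers to the ID-bearing name only matches for one particular assignment of IDs, while the pretty name is the same across restarts.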