dtenedor commented on code in PR #50606: URL: https://github.com/apache/spark/pull/50606#discussion_r2047817277
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala: ########## @@ -446,11 +447,15 @@ package object dsl { def sortBy(sortExprs: SortOrder*): LogicalPlan = Sort(sortExprs, false, logicalPlan) def groupBy(groupingExprs: Expression*)(aggregateExprs: Expression*): LogicalPlan = { + val groupingExprsWithReplacedOrdinals = groupingExprs.map { Review Comment: Can you please add a comment here saying what this part is doing? ########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ########## @@ -1825,24 +1825,32 @@ class AstBuilder extends DataTypeAstBuilder } visitNamedExpression(n) }.toSeq + val groupByExpressionsWithReplacedOrdinals = + replaceOrdinalsInGroupingExpressions(groupByExpressions) if (ctx.GROUPING != null) { // GROUP BY ... GROUPING SETS (...) // `groupByExpressions` can be non-empty for Hive compatibility. It may add extra grouping // expressions that do not exist in GROUPING SETS (...), and the value is always null. // For example, `SELECT a, b, c FROM ... GROUP BY a, b, c GROUPING SETS (a, b)`, the output // of column `c` is always null. val groupingSets = - ctx.groupingSet.asScala.map(_.expression.asScala.map(e => expression(e)).toSeq) - Aggregate(Seq(GroupingSets(groupingSets.toSeq, groupByExpressions)), + ctx.groupingSet.asScala.map(_.expression.asScala.map(e => { Review Comment: Can you please add a comment here saying what this part is doing? ########## sql/core/src/test/scala/org/apache/spark/sql/analysis/resolver/AggregateResolverSuite.scala: ########## @@ -44,12 +44,6 @@ class AggregateResolverSuite extends QueryTest with SharedSparkSession { resolverRunner.resolve(query) Review Comment: Can you copy these test contents to the Jira so we don't forget? 
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ########## @@ -6558,6 +6573,31 @@ class AstBuilder extends DataTypeAstBuilder } } + private def visitSortItemAndReplaceOrdinals(sortItemContext: SortItemContext) = { Review Comment: Can you please add a comment here saying what these new methods are doing? ########## sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala: ########## @@ -929,7 +929,16 @@ class Dataset[T] private[sql]( /** @inheritdoc */ @scala.annotation.varargs def groupBy(cols: Column*): RelationalGroupedDataset = { - RelationalGroupedDataset(toDF(), cols.map(_.expr), RelationalGroupedDataset.GroupByType) + val groupingExpressionsWithReplacedOrdinals = cols.map { col => col.expr match { Review Comment: Can you please add a comment here saying what this part is doing? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org For queries about this service, please contact Infrastructure at: users@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org For additional commands, e-mail: reviews-help@spark.apache.org