cloud-fan commented on code in PR #54297:
URL: https://github.com/apache/spark/pull/54297#discussion_r2828558617


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -449,9 +449,7 @@ trait CheckAnalysis extends LookupCatalog with 
QueryErrorsBase with PlanToString
               messageParameters = Map("funcName" -> toSQLExpr(w)))
 
           case agg @ AggregateExpression(listAgg: ListAgg, _, _, _, _)
-            if agg.isDistinct && listAgg.needSaveOrderValue =>
-            throw 
QueryCompilationErrors.functionAndOrderExpressionMismatchError(
-              listAgg.prettyName, listAgg.child, listAgg.orderExpressions)
+            if agg.isDistinct => listAgg.validateDistinctOrderCompatibility()

Review Comment:
   I don't like this approach as it's unclear what happens next if we don't 
fail here. Does the DISTINCT execution path save the order value?
   
   Even if we add comments here, it's making an assumption of the physical 
execution path that is far away from here.
   
   I still prefer my previous proposal: we can replace the order value 
expression of `ListAgg` to a different but order-preserving expression (certain 
CAST). It needs to happen before `CheckAnalysis`, so we can add a new analyzer 
rule to do it. For the new single-pass analyzer, this rewrite should happen 
after we fully resolve `ListAgg`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to