cloud-fan commented on code in PR #54297:
URL: https://github.com/apache/spark/pull/54297#discussion_r2828558617
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -449,9 +449,7 @@ trait CheckAnalysis extends LookupCatalog with
QueryErrorsBase with PlanToString
messageParameters = Map("funcName" -> toSQLExpr(w)))
case agg @ AggregateExpression(listAgg: ListAgg, _, _, _, _)
- if agg.isDistinct && listAgg.needSaveOrderValue =>
- throw
QueryCompilationErrors.functionAndOrderExpressionMismatchError(
- listAgg.prettyName, listAgg.child, listAgg.orderExpressions)
+ if agg.isDistinct => listAgg.validateDistinctOrderCompatibility()
Review Comment:
I don't like this approach as it's unclear what happens next if we don't
fail here. Does the DISTINCT execution path save the order value?
Even if we add comments here, it's making an assumption of the physical
execution path that is far away from here.
I still prefer my previous proposal: we can replace the order value
expression of `ListAgg` to a different but order-preserving expression (certain
CAST). It needs to happen before `CheckAnalysis`, so we can add a new analyzer
rule to do it. For the new single-pass analyzer, this rewrite should happen
after we fully resolve `ListAgg`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]