nuno-faria commented on PR #17058: URL: https://github.com/apache/datafusion/pull/17058#issuecomment-3171931826
Thanks @berkaysynnada for the pointers. I did a little more digging and found that the `get_finer_aggregate_exprs_requirement` function returns a different value for `string_agg` and `array_agg`. The source of the difference is that `AggregateUDFImpl::reverse_expr` is not defined for `string_agg`. This means the `SortExec` will not be created for `string_agg`, and that is still true in the main branch of datafusion. ``` # string_agg | physical_plan after CombinePartialFinalAggregate | OutputRequirementExec: order_by=[], dist_by=Unspecified | | AggregateExec: mode=Single, gby=[], aggr=[string_agg(t.k,Utf8(",")) ORDER BY [t.v ASC NULLS LAST]] | | DataSourceExec: partitions=1, partition_sizes=[1] | | | physical_plan after EnforceSorting | SAME TEXT AS ABOVE # array_agg | physical_plan after CombinePartialFinalAggregate | OutputRequirementExec: order_by=[], dist_by=Unspecified | | AggregateExec: mode=Single, gby=[], aggr=[array_agg(t.k) ORDER BY [t.v ASC NULLS LAST]] | | DataSourceExec: partitions=1, partition_sizes=[1] | | | physical_plan after EnforceSorting | OutputRequirementExec: order_by=[], dist_by=Unspecified | | AggregateExec: mode=Single, gby=[], aggr=[array_agg(t.k) ORDER BY [t.v ASC NULLS LAST]] | | SortExec: expr=[v@1 ASC NULLS LAST], preserve_partitioning=[false] | | DataSourceExec: partitions=1, partition_sizes=[1] ``` The fix is to simply implement `AggregateUDFImpl::reverse_expr` for `StringAgg`. @alamb @findepi what do you think of this approach? With it there is no need cherry pick a new feature into `49.0.1`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org