alamb commented on issue #14991:
URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2722665262

   > Specifically generated_id is in sorted order and since it is the only 
group column for the aggregation I would expect ordering_mode=Sorted in the 
AggregateExec. For example, if I change the values/agg function and remove the 
unnest, this ordering is preserved:
   
   I think the reason this plan can't use `ordering_mode=Sorted` is that 
LazyMemoryExec doesn't report that its output is sorted:
   
   
https://github.com/apache/datafusion/blob/ce536c9190f37533bd59754acb8effa6974873e6/datafusion/physical-plan/src/memory.rs#L158-L157
   
   I think if we reported that the output of LazyMemoryExec was sorted (via 
EquivalenceProperties) the correct aggregate would be used.
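   
   To illustrate why the reported ordering matters, here is a minimal, 
standalone sketch of the two aggregation strategies. It is not DataFusion 
code: the function names `sum_by_key_hashed` and `sum_by_key_sorted` and the 
`(key, value)` row layout are made up for illustration. The streaming version 
is only correct when the input is known to be sorted by the group key, which 
is the kind of guarantee a plan node would have to advertise.

```rust
use std::collections::HashMap;

/// Hash-based grouping: works for any input order, but must keep every
/// group in memory until the whole input has been read.
fn sum_by_key_hashed(rows: &[(u64, i64)]) -> Vec<(u64, i64)> {
    let mut groups: HashMap<u64, i64> = HashMap::new();
    for &(key, value) in rows {
        *groups.entry(key).or_insert(0) += value;
    }
    let mut out: Vec<(u64, i64)> = groups.into_iter().collect();
    // Sort only so the result can be compared with the streaming version.
    out.sort_by_key(|&(key, _)| key);
    out
}

/// Streaming grouping: assumes `rows` is already sorted by key, so a group
/// is finished (and can be emitted) as soon as the key changes.
fn sum_by_key_sorted(rows: &[(u64, i64)]) -> Vec<(u64, i64)> {
    let mut out = Vec::new();
    let mut current: Option<(u64, i64)> = None;
    for &(key, value) in rows {
        if let Some((k, sum)) = current {
            if k == key {
                // Same group: keep accumulating.
                current = Some((k, sum + value));
                continue;
            }
            // Key changed: the previous group is complete.
            out.push((k, sum));
        }
        current = Some((key, value));
    }
    if let Some(last) = current {
        out.push(last);
    }
    out
}

fn main() {
    // Input sorted by the group key (first tuple field).
    let rows: [(u64, i64); 5] = [(1, 10), (1, 5), (2, 7), (3, 1), (3, 2)];
    assert_eq!(sum_by_key_sorted(&rows), sum_by_key_hashed(&rows));
    println!("{:?}", sum_by_key_sorted(&rows));
}
```

   The streaming version holds only one group of state at a time, but it 
silently produces wrong results on unsorted input, so a planner can only pick 
it when the input's ordering is declared.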
   
   @asubiotto this is a good improvement, but I think it is orthogonal to the 
issue described in this ticket. Can you file a separate ticket for your use case?
   
   Relatedly, @akurmustafa / @mustafasrepo and I just wrote a blog post about 
ordering equivalences and how they work in DataFusion: 
https://datafusion.apache.org/blog/2025/03/11/ordering-analysis/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

