Dandandan commented on issue #15676:
URL: https://github.com/apache/datafusion/issues/15676#issuecomment-2796224178

   Thanks @UBarney, nice to see it is already documented.
   
   So to give another example, tests with queries like the following:
   
   
https://github.com/apache/datafusion-comet/blob/fdaec64fd313954ae788413ec02cf652cfa570b9/spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala#L1052
   
   ```
   SELECT FIRST(col1), LAST(col1) FROM t
   SELECT FIRST(col1), LAST(col1), col3 FROM t GROUP BY col3
   ```
   
   **Will** break in the future whenever we change internals in a way that 
alters the order in which batches / rows reach an aggregation function (the 
result does not depend only on the order in which `t` is scanned).
   
   So a query `SELECT FIRST(col1), LAST(col1) FROM t` can basically be seen 
as semantically equivalent to queries like the following (just giving some 
extreme examples to make a point):
   
   ```
   SELECT ANY_VALUE(col1), ANY_VALUE(col1) FROM t
   SELECT MIN(col1), MAX(col1) FROM t
   SELECT FIRST(col1), LAST(col1) FROM t ORDER BY RANDOM()
   SELECT FIRST(col1), LAST(col1) FROM t LIMIT 1
   ```
   
   > Returns the last element in an aggregation group according to the 
requested ordering. If no ordering is given, returns an arbitrary element from 
the group.
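
To make the point concrete, here is a minimal sketch in plain Python (not DataFusion or Comet code). The helper `first_last`, the sample batches, and the use of `col2` as an ordering column are all hypothetical, chosen only to mimic rows reaching an aggregation batch by batch. It shows that without a requested ordering the result depends on batch arrival order, while with one it does not.

```python
from itertools import permutations

# Hypothetical stand-in for FIRST/LAST: rows reach the aggregation batch by
# batch, and the result is taken from whatever order they arrive in, unless
# an explicit ordering key is requested.
def first_last(batches, order_key=None):
    rows = [row for batch in batches for row in batch]
    if order_key is not None:
        rows = sorted(rows, key=order_key)
    return rows[0]["col1"], rows[-1]["col1"]

# Made-up data: three batches of table t, with col2 as an ordering column.
batches = [
    [{"col1": 1, "col2": 10}, {"col1": 2, "col2": 20}],
    [{"col1": 3, "col2": 30}],
    [{"col1": 4, "col2": 40}],
]

# Every possible batch arrival order, without and with a requested ordering.
unordered = {first_last(p) for p in permutations(batches)}
ordered = {
    first_last(p, order_key=lambda r: r["col2"]) for p in permutations(batches)
}

print(sorted(unordered))  # several different (first, last) pairs
print(sorted(ordered))    # a single pair, independent of arrival order
```

A test that asserts one specific pair from the unordered case is only pinning down the current batch order, which is an implementation detail.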
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

