Dandandan commented on issue #15676: URL: https://github.com/apache/datafusion/issues/15676#issuecomment-2796224178
Thanks @UBarney, nice to see it is already documented.

So to give another example, tests with queries like the following:

https://github.com/apache/datafusion-comet/blob/fdaec64fd313954ae788413ec02cf652cfa570b9/spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala#L1052

```
SELECT FIRST(col1), LAST(col1) FROM t
SELECT FIRST(col1), LAST(col1), col3 FROM t GROUP BY col3
```

**will** break in the future whenever we change some internals that change the order in which batches / rows reach an aggregation function (so the result does not depend only on the order in which `t` is scanned).

So a query `SELECT FIRST(col1), LAST(col1) FROM t` can basically be seen as semantically equivalent to queries like the following (just giving some extreme examples to make a point):

```
SELECT ANY_VALUE(col1), ANY_VALUE(col1) FROM t
SELECT MIN(col1), MAX(col1) FROM t
SELECT FIRST(col1), LAST(col1) FROM t ORDER BY RANDOM()
SELECT FIRST(col1), LAST(col1) FROM t LIMIT 1
```

> Returns the last element in an aggregation group according to the requested ordering. If no ordering is given, returns an arbitrary element from the group.
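
To make the point concrete, here is a small toy model in plain Python. It is not DataFusion or Comet code; the data, the `sort_key` column, and the function names are all made up for illustration. It only sketches why `FIRST` / `LAST` without a requested ordering depend on how batches happen to arrive, while an explicit ordering removes that dependence:

```python
# Toy model: an aggregation sees rows batch by batch, and the batch order is
# an internal detail that may change between engine versions.
rows = [(3, "c"), (1, "a"), (2, "b")]  # hypothetical (sort_key, col1) rows


def first_last(batches):
    """FIRST/LAST with no ordering: the result follows arrival order."""
    seen = [col1 for batch in batches for _, col1 in batch]
    return seen[0], seen[-1]


def first_last_ordered(batches):
    """FIRST/LAST with an explicit ordering on sort_key: arrival order is irrelevant."""
    seen = sorted(row for batch in batches for row in batch)
    return seen[0][1], seen[-1][1]


plan_a = [rows[:2], rows[2:]]  # one possible batching of the table
plan_b = [rows[2:], rows[:2]]  # same rows, batches arrive in another order

unordered_a, unordered_b = first_last(plan_a), first_last(plan_b)
ordered_a, ordered_b = first_last_ordered(plan_a), first_last_ordered(plan_b)
```

In this sketch the two unordered results differ even though both plans contain exactly the same rows, while the two ordered results agree. A test that asserts on the unordered result is effectively asserting on the batching.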