andygrove commented on issue #15676:
URL: https://github.com/apache/datafusion/issues/15676#issuecomment-2796920220

   Sorry to cause so much work for everyone discussing this 😞 
   
   For order-preserved input, such as a single partition on a single thread, 
DataFusion has matched Spark's behavior for roughly the past year. That 
behavior changed in the past week. The change isn't necessarily "wrong," but it 
is a breaking change for some downstream users. Comet's test suites run the same 
query against Spark and Comet/DataFusion and compare the results. The tests are 
mostly deterministic. Upgrading to the soon-to-be-released DataFusion 47 causes 
test failures, hence I reported this issue.
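   
   As a rough illustration of why input order matters for `LAST`, here is a toy 
model in plain Python. It is not DataFusion's or Spark's actual implementation; 
the function name `last_per_group` and the sample rows are hypothetical. With a 
single ordered stream, the result is fixed. Once the input is split and the 
partial results are merged in an arbitrary order, the result can differ.

```python
def last_per_group(rows):
    """Return the last value seen for each key, in arrival order."""
    result = {}
    for key, value in rows:
        result[key] = value  # a later row overwrites an earlier one
    return result


# Single partition, single thread: rows arrive in one fixed order.
rows = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]
single = last_per_group(rows)

# Two partitions whose partial results are merged in either order.
# In a dict merge, the right-hand operand wins on duplicate keys.
part1, part2 = rows[:2], rows[2:]
merged_12 = {**last_per_group(part1), **last_per_group(part2)}
merged_21 = {**last_per_group(part2), **last_per_group(part1)}

print(single)     # order-preserved result
print(merged_12)  # merge order happens to match input order
print(merged_21)  # merge order reversed
```

   Only the single-stream case and the merge that happens to follow input order 
agree, which is the kind of mismatch a test that compares two engines on the 
same query would surface.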
   
   In the Comet project, we now have the following choices:
   
   - 1. Rewrite our tests for `LAST` to stop comparing to Spark and implement 
some other means to determine that the behavior is correct, and also document 
that Comet is not compatible with Spark in some cases
   - 2. Fork the `LAST` implementation and maintain it in Comet
   - 3. See if there are options for DataFusion to support the order-preserved 
case
   
   Any one of these options can work.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

