andygrove commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2626014384
I almost started a conversation about this but held back. Moving this crate upstream has a lot of value, and I support doing so. However, assuming that most DataFusion contributors would be unhappy about running Apache Spark as part of the test suite (I think this is a safe assumption), we need a testing plan. In Comet, we rely on integration tests that run Spark with and without Comet enabled and compare results. We also run a subset of Spark's test suite with Comet enabled. If we move the spark-expr crate into the main DataFusion repository, we will have to rely more on unit testing to demonstrate correct behavior. This is not a bad thing, of course, but it makes reviews much harder. There is an increased risk of PRs being accepted that are not fully compatible with Spark, and we won't find out in Comet until we upgrade to the next DataFusion version and run the Spark tests.