alamb commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2625779458
> I love the idea of collaborating on Spark compatible `UDF`s.
>
> As of writing, `243/402` Spark functions doc-tests pass on Sail. We haven't focused on performance yet and instead have been focusing on just knocking all of them out because there are so many of them. Our implementations can be found:
>
> * https://github.com/lakehq/sail/tree/main/crates/sail-plan/src/function
> * https://github.com/lakehq/sail/tree/main/crates/sail-plan/src/extension/function

Nice!

> I will say that we have encountered numerous problems relying on downstream DataFusion-based crates,...The issue isn't with the crates themselves but arises when it's time to upgrade DataFusion versions, requiring us to wait for each crate to update and release a new version.

Indeed -- I think the key challenge is mustering enough maintainer bandwidth across the crates to keep them up to date and updated quickly.

> We haven't done as good a job as [@andygrove](https://github.com/andygrove) and the Comet folks with documenting what we do and don't support (we're currently a small team of just two people). However, we do have test reports for every pull request to track coverage.

It sounds pretty amazing. There is a tradeoff between moving fast in the short term and building up the capacity to maintain it in the longer term.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: github-h...@datafusion.apache.org