alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2870055758
> > > > How would that work going from sync -> async? For example: `1 = 2 OR 1 = call_llm_model_async()`. I imagine this would build something like `BinaryExpr(BinaryExpr(1, Eq, 2), Or, ScalarFunc(call_llm_model_async))`. If we call `evaluate_async` on the outer `BinaryExpr` it would call `evaluate()` by default so now you're in sync world. How do you break back into async world? Do we pass around a handle to the tokio runtime?
> > >
> > > Easy answer is converting the original `evaluate()`s to async and moving all `evaluate()` impls to `evaluate_sync()`, but I cannot fully estimate its effects and challenges. Does anything come to mind?
> >
> > I mean that makes sense but sounds like a lot of churn? I'm not sure tbh; sync / async coloring is always a pain and I don't know of any good solutions :(
>
> I'll try a POC when I find some time, and I wonder about @alamb's opinion

My feeling (without any solid data) is that using `async` functions is not ideal because:

1. The async overhead (e.g. the cost of an `await` compared with a normal function call) could be noticeable, but is maybe not that big a deal.
2. Everything that calls a UDF would have to be async, as only async functions can call other async functions (the so-called "what color are your functions" problem). That would be quite disruptive.

Another benefit of the approach in this PR is that it requires no changes to any existing functions or APIs (in fact the original POC can be implemented entirely as a DataFusion user defined optimizer extension).
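To make the coloring problem concrete, here is a minimal sketch using only the Rust standard library. It is not DataFusion code and not the implementation in this PR: `Expr`, `call_async`, `resolve_async`, `evaluate`, `example_expr` and `block_on` are all hypothetical names invented for illustration. It assumes one possible shape for "no changes to existing functions": a separate async pass resolves the async calls first, so the existing sync `evaluate` never has to await anything.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Toy expression tree (hypothetical; not DataFusion's expression types).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Eq(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    /// A call whose result is only available asynchronously.
    AsyncCall(&'static str),
}

/// Stand-in for an async UDF such as `call_llm_model_async()`; always yields 1.
async fn call_async(_name: &str) -> i64 {
    1
}

/// Async pass: replace every `AsyncCall` with the literal it resolves to.
/// The future is boxed because an async function cannot recurse directly.
fn resolve_async(expr: Expr) -> Pin<Box<dyn Future<Output = Expr>>> {
    Box::pin(async move {
        match expr {
            Expr::AsyncCall(name) => Expr::Literal(call_async(name).await),
            Expr::Eq(l, r) => Expr::Eq(
                Box::new(resolve_async(*l).await),
                Box::new(resolve_async(*r).await),
            ),
            Expr::Or(l, r) => Expr::Or(
                Box::new(resolve_async(*l).await),
                Box::new(resolve_async(*r).await),
            ),
            literal => literal,
        }
    })
}

/// Sync evaluation, untouched by the async pass. It cannot `.await`,
/// so an unresolved `AsyncCall` is an error here.
fn evaluate(expr: &Expr) -> Result<i64, String> {
    match expr {
        Expr::Literal(v) => Ok(*v),
        Expr::Eq(l, r) => Ok((evaluate(l)? == evaluate(r)?) as i64),
        Expr::Or(l, r) => Ok((evaluate(l)? != 0 || evaluate(r)? != 0) as i64),
        Expr::AsyncCall(name) => Err(format!("cannot await `{name}` from sync code")),
    }
}

/// Builds `1 = 2 OR 1 = call_llm_model_async()`.
fn example_expr() -> Expr {
    Expr::Or(
        Box::new(Expr::Eq(Box::new(Expr::Literal(1)), Box::new(Expr::Literal(2)))),
        Box::new(Expr::Eq(
            Box::new(Expr::Literal(1)),
            Box::new(Expr::AsyncCall("call_llm_model_async")),
        )),
    )
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor, standing in for a real runtime such as tokio.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

Calling `evaluate(&example_expr())` directly returns an error, because the sync path has no way to await the call. Running `block_on(resolve_async(example_expr()))` first and then passing the result to `evaluate` succeeds, and `evaluate` itself did not have to change color.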
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org