Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-07-21 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-3098624432 I made a follow on PR to update the docs a bit: - https://github.com/apache/datafusion/pull/16846 This is so exciting -- This is an automated message from the Apache Git Ser

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-23 Thread via GitHub
goldmedal commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2998502945 Thanks @alamb @berkaysynnada @kylebarron @ozankabak @Omega359 @paleolimbot for reviewing and suggestions 🚀

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-23 Thread via GitHub
alamb merged PR #14837: URL: https://github.com/apache/datafusion/pull/14837

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-23 Thread via GitHub
alamb commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2162340524 ## datafusion/core/src/physical_planner.rs: ## @@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner { let runtime_expr = self.cr

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-23 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2997613389 Thanks @goldmedal -- I will file some follow on tickets and then merge

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-23 Thread via GitHub
goldmedal commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2997099582 Hi @alamb, I have fixed the conflicts. If there are no more comments, I think we can merge it.

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-19 Thread via GitHub
samuelcolvin commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2987279730 In case anyone is interested, I used async UDFs to implement SQL function support for datafusion, demo here - https://github.com/samuelcolvin/datafusion-sql-udfs.

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-18 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2985235955 > This would be extremely useful for us. @alamb please would this get merged 🙏 . I will work on this later today or on Friday

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-18 Thread via GitHub
samuelcolvin commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2983962469 https://github.com/apache/datafusion/blob/630aa7b0c7b44ea8e77f9e0d685bf79f2a3cd3bd/datafusion/core/src/execution/context/mod.rs#L1766 Needs an option for async UDFs I gues

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-18 Thread via GitHub
samuelcolvin commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2983627011 This would be extremely useful for us. @alamb please would this get merged 🙏 .

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-03 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2935875166 🤖: Benchmark completed Details ``` group epic_async-udf main -

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-03 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2935488103 🤖 `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.11.0-1013-gcp #13~

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-06-03 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2935116007 🤖 `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.11.0-1013-gcp #13~

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-22 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2902263734 Unless I hear anything else I plan to merge this tomorrow and will file a follow on Epic for other tasks (docs / blogs / support in other types of plans)

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-21 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2898317659 Are there any remaining outstanding issues to merging this PR? If not, perhaps we can merge it and file an epic / ticket for filling out the remaining features. A blog pos

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-13 Thread via GitHub
goldmedal commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2087328465 ## datafusion-examples/examples/async_udf.rs: ## @@ -0,0 +1,256 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-13 Thread via GitHub
goldmedal commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2877380478 > That's a use case, but there are others too. Maybe one runs a forecast model, which is a little too complicated to "embed" into the query engine. In that case, we may still want

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-12 Thread via GitHub
paleolimbot commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2873182441 Just a quick two cents that I quite like the approach in this PR. I've been keeping an eye on this for a use case not dissimilar to the "llm" use case, where we want to use an AP

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-12 Thread via GitHub
ozankabak commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2872740234 > Maybe the use case is "send many rows of data to a remote LLM service" for example, That's a use case, but there are others too. Maybe one runs a forecast model, which is a

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-12 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2872678278 > Can we extend this sort of an approach to UDAFs? Having two entirely separate mechanisms would not be great. The data for an `async` user defined aggregate function must come f

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-12 Thread via GitHub
ozankabak commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2872558077 Can we extend this sort of an approach to UDAFs? Having two entirely separate mechanisms would not be great.

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-12 Thread via GitHub
berkaysynnada commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2871247518 > 1. The async overhead (e.g. what it takes to make await vs a normal function) could be noticeable, but maybe not that big a deal If the awaited task doesn't include any

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-11 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2870055758 > > > > How would that work going from sync -> async? For example: `1 = 2 OR 1 = call_llm_model_async()`. I imagine this would build something like `BinaryExpr(BinaryExpr(1, Eq, 2), Or

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-11 Thread via GitHub
berkaysynnada commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869915327 > > > How would that work going from sync -> async? For example: `1 = 2 OR 1 = call_llm_model_async()`. I imagine this would build something like `BinaryExpr(BinaryExpr(1, Eq,

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-11 Thread via GitHub
adriangb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869780441 > > How would that work going from sync -> async? For example: `1 = 2 OR 1 = call_llm_model_async()`. I imagine this would build something like `BinaryExpr(BinaryExpr(1, Eq, 2), Or,

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-11 Thread via GitHub
berkaysynnada commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083486922 ## datafusion/core/src/physical_planner.rs: ## @@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner { let runtime_expr =

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-11 Thread via GitHub
berkaysynnada commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869684912 > How would that work going from sync -> async? For example: `1 = 2 OR 1 = call_llm_model_async()`. I imagine this would build something like `BinaryExpr(BinaryExpr(1, Eq, 2),

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-11 Thread via GitHub
alamb commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083479793 ## datafusion/core/src/physical_planner.rs: ## @@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner { let runtime_expr = self.cr

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-10 Thread via GitHub
adriangb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869167967 > What if we just added a new method to the PhysicalExpr trait, like `evaluate_async()`? We could then call this from streams that might involve async work. The default implementati

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-10 Thread via GitHub
berkaysynnada commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083270057 ## datafusion/core/src/physical_planner.rs: ## @@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner { let runtime_expr =

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-04-23 Thread via GitHub
berkaysynnada commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2824434533 > > What's the status of this PR? > > It's ready to review. I'm still waiting for someone to help review it. Thanks @goldmedal. We'll need this as well, so let's re

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-04-22 Thread via GitHub
goldmedal commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2821640679 > What's the status of this PR? It's ready to review. I'm still waiting for someone to help review it.

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-04-22 Thread via GitHub
berkaysynnada commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2821209325 What's the status of this PR?

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-03-21 Thread via GitHub
goldmedal commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2008683504 ## datafusion/physical-expr/src/async_scalar_function.rs: ## @@ -0,0 +1,227 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-03-21 Thread via GitHub
Omega359 commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2007678119 ## datafusion/physical-expr/src/async_scalar_function.rs: ## @@ -0,0 +1,227 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-03-12 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2719259117 Thanks I'll put it on my list

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-03-11 Thread via GitHub
goldmedal commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2716277325 @alamb Sorry for the delay. This PR is ready for review now. I want to focus on `Projection` and `Filter`, which currently invoke the async UDF. After ensuring the approach makes

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-02-24 Thread via GitHub
alamb commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2678906575 😮 -- thanks @goldmedal -- I'll put this on my list of things to review