alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-3098624432
I made a follow on PR to update the docs a bit:
- https://github.com/apache/datafusion/pull/16846
This is so exciting
--
This is an automated message from the Apache Git Ser
goldmedal commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2998502945
Thanks @alamb @berkaysynnada @kylebarron @ozankabak @Omega359 @paleolimbot
for the reviews and suggestions 🚀
alamb merged PR #14837:
URL: https://github.com/apache/datafusion/pull/14837
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusi
alamb commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2162340524
##
datafusion/core/src/physical_planner.rs:
##
@@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner {
let runtime_expr =
self.cr
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2997613389
Thanks @goldmedal -- I will file some follow on tickets and then merge
goldmedal commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2997099582
Hi @alamb,
I have fixed the conflicts. If no more comments, I think we can merge it.
samuelcolvin commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2987279730
In case anyone is interested, I used async UDFs to implement SQL function
support for DataFusion; demo here:
https://github.com/samuelcolvin/datafusion-sql-udfs
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2985235955
> This would be extremely useful for us. @alamb please would this get merged 🙏.
I will work on this later today or on Friday
samuelcolvin commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2983962469
https://github.com/apache/datafusion/blob/630aa7b0c7b44ea8e77f9e0d685bf79f2a3cd3bd/datafusion/core/src/execution/context/mod.rs#L1766
Needs an option for async UDFs I gues
samuelcolvin commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2983627011
This would be extremely useful for us. @alamb please would this get merged 🙏.
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2935875166
🤖: Benchmark completed
Details
```
group epic_async-udf
main
-
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2935488103
🤖 `./gh_compare_branch_bench.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2935116007
🤖 `./gh_compare_branch_bench.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2902263734
Unless I hear anything else I plan to merge this tomorrow and will file a
follow on Epic for other tasks (docs / blogs / support in other types of plans)
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2898317659
Are there any remaining outstanding issues to merging this PR?
If not, perhaps we can merge it and file an epic / ticket for filling out
the remaining features.
A blog pos
goldmedal commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2087328465
##
datafusion-examples/examples/async_udf.rs:
##
@@ -0,0 +1,256 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license
goldmedal commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2877380478
> That's a use case, but there are others too. Maybe one runs a forecast
model, which is a little too complicated to "embed" into the query engine. In
that case, we may still want
paleolimbot commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2873182441
Just a quick two cents that I quite like the approach in this PR. I've been
keeping an eye on this for a use case not dissimilar to the "llm" use case,
where we want to use an AP
ozankabak commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2872740234
> Maybe the use case is "send many rows of data to a remote LLM service" for
example,
That's a use case, but there are others too. Maybe one runs a forecast
model, which is a
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2872678278
> Can we extend this sort of an approach to UDAFs? Having two entirely
separate mechanisms would not be great.
The data for an `async` user defined aggregate function must come f
ozankabak commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2872558077
Can we extend this sort of an approach to UDAFs? Having two entirely
separate mechanisms would not be great.
berkaysynnada commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2871247518
> 1. The async overhead (e.g. what it takes to make await vs a normal
function) could be noticeable, but maybe not that big a deal
If the awaited task doesn't include any
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2870055758
> > > > How would that work going from sync -> async? For example: `1 = 2 OR
1 = call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq, 2), Or
berkaysynnada commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869915327
> > > How would that work going from sync -> async? For example: `1 = 2 OR 1
= call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq,
adriangb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869780441
> > How would that work going from sync -> async? For example: `1 = 2 OR 1 =
call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq, 2), Or,
berkaysynnada commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083486922
##
datafusion/core/src/physical_planner.rs:
##
@@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner {
let runtime_expr =
berkaysynnada commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869684912
> How would that work going from sync -> async? For example: `1 = 2 OR 1 =
call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq, 2),
alamb commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083479793
##
datafusion/core/src/physical_planner.rs:
##
@@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner {
let runtime_expr =
self.cr
adriangb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869167967
> What if we just added a new method to the PhysicalExpr trait, like
`evaluate_async()`? We could then call this from streams that might involve
async work. The default implementati
berkaysynnada commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083270057
##
datafusion/core/src/physical_planner.rs:
##
@@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner {
let runtime_expr =
berkaysynnada commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2824434533
> > What's the status of this PR?
>
> It's ready to review. I'm still waiting for someone to help review it.
Thanks @goldmedal. We'll need this as well, so let's re
goldmedal commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2821640679
> What's the status of this PR?
It's ready to review. I'm still waiting for someone to help review it.
berkaysynnada commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2821209325
What's the status of this PR?
goldmedal commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2008683504
##
datafusion/physical-expr/src/async_scalar_function.rs:
##
@@ -0,0 +1,227 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contrib
Omega359 commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2007678119
##
datafusion/physical-expr/src/async_scalar_function.rs:
##
@@ -0,0 +1,227 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contribu
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2719259117
Thanks I'll put it on my list
goldmedal commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2716277325
@alamb Sorry for the delay. This PR is ready for review now.
I want to focus on `Projection` and `Filter`, which currently invoke the
async UDF. After ensuring the approach makes
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2678906575
😮 -- thanks @goldmedal -- I'll put this on my list of things to review
38 matches