findepi commented on code in PR #12853:
URL: https://github.com/apache/datafusion/pull/12853#discussion_r1798843685
##
datafusion/common/src/types/logical.rs:
##
@@ -0,0 +1,58 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agree
findepi commented on code in PR #12853:
URL: https://github.com/apache/datafusion/pull/12853#discussion_r1798841086
##
datafusion/common/src/types/logical.rs:
##
@@ -0,0 +1,41 @@
+use core::fmt;
+use std::{cmp::Ordering, hash::Hash, sync::Arc};
+
+use super::NativeType;
+
+/// A
yoavcloud commented on PR #1467:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1467#issuecomment-2410086902
@iffyio I tried inlining the function, but it's not as short as I'd hoped,
due to pattern matching on the boxed fields. But maybe I'm missing something...
Would love you
jonahgao commented on code in PR #12917:
URL: https://github.com/apache/datafusion/pull/12917#discussion_r1798789739
##
datafusion/physical-expr/src/math_expressions.rs:
##
@@ -1,126 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor l
jonahgao commented on code in PR #12917:
URL: https://github.com/apache/datafusion/pull/12917#discussion_r1798786852
##
datafusion/physical-expr/src/math_expressions.rs:
##
@@ -1,126 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor l
jonahgao commented on code in PR #12889:
URL: https://github.com/apache/datafusion/pull/12889#discussion_r1798785780
##
datafusion/physical-expr/src/math_expressions.rs:
##
@@ -17,60 +17,27 @@
//! Math expressions
-use std::any::type_name;
use std::sync::Arc;
-use arrow:
jonahgao opened a new pull request, #12917:
URL: https://github.com/apache/datafusion/pull/12917
## Which issue does this PR close?
N/A
## Rationale for this change
Found when reviewing #12889
It should be a leftover from moving math functions.
## What
Lordworms commented on PR #12781:
URL: https://github.com/apache/datafusion/pull/12781#issuecomment-2409938373
When I was implementing this, the hard part was how to collect global build-side
information in partition mode. Currently my idea is to use a lock to
ensure all the build-side op
goldmedal commented on PR #12816:
URL: https://github.com/apache/datafusion/pull/12816#issuecomment-2409843953
> Here is the performance of this PR. Some queries are slower, some are
faster.
>
> I believe once we turn on string view everything will be faster.
>
Thanks @alam
jonahgao closed issue #12149: `octet_length()` function not working for
StringView columns (SQLancer)
URL: https://github.com/apache/datafusion/issues/12149
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
jonahgao merged PR #12900:
URL: https://github.com/apache/datafusion/pull/12900
jonahgao closed issue #12463: hooks for `temporary` tables in the catalog
URL: https://github.com/apache/datafusion/issues/12463
jonahgao merged PR #12561:
URL: https://github.com/apache/datafusion/pull/12561
github-actions[bot] commented on PR #10584:
URL: https://github.com/apache/datafusion/pull/10584#issuecomment-2409696079
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
github-actions[bot] commented on PR #11897:
URL: https://github.com/apache/datafusion/pull/11897#issuecomment-2409695828
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
github-actions[bot] closed pull request #11794: Better multi-column aggregation
support with StringView
URL: https://github.com/apache/datafusion/pull/11794
github-actions[bot] commented on PR #11758:
URL: https://github.com/apache/datafusion/pull/11758#issuecomment-2409695918
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
andygrove merged PR #27:
URL: https://github.com/apache/datafusion-ray/pull/27
andygrove merged PR #31:
URL: https://github.com/apache/datafusion-ray/pull/31
peasee opened a new pull request, #12916:
URL: https://github.com/apache/datafusion/pull/12916
## Which issue does this PR close?
Closes #12915.
## Rationale for this change
* To fix a bug with casting integers in MySQL
## What changes are included
peasee opened a new issue, #12915:
URL: https://github.com/apache/datafusion/issues/12915
### Describe the bug
`CAST(.. AS INTEGER)` is not supported in MySQL, as `INTEGER` is not a valid
cast type.
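For illustration, a dialect-aware cast could be rendered as sketched below. This assumes MySQL's `CAST` takes `SIGNED` as its integer target type; the `render_cast` helper and the dialect strings are hypothetical and are not part of DataFusion's API.

```python
# Hypothetical helper, not DataFusion's API: choose a cast target per dialect.
INTEGER_TYPES = {"INT", "INTEGER", "BIGINT"}


def render_cast(dialect: str, expr: str, sql_type: str) -> str:
    """Render `CAST(expr AS type)`, swapping in a type the dialect accepts."""
    target = sql_type.upper()
    # Assumption: MySQL rejects INTEGER as a cast target but accepts SIGNED.
    if dialect == "mysql" and target in INTEGER_TYPES:
        target = "SIGNED"
    return f"CAST({expr} AS {target})"
```

With this sketch, `render_cast("mysql", "a", "INTEGER")` yields a `SIGNED` cast, while other dialect strings keep the requested type unchanged.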
### To Reproduce
Create an execution plan using the MySQL Dialect that i
austin362667 commented on code in PR #27:
URL: https://github.com/apache/datafusion-ray/pull/27#discussion_r1798588033
##
docs/README.md:
##
@@ -17,12 +17,12 @@
under the License.
-->
-# RaySQL Design Documentation
+# Datafusion Ray Design Documentation
Review Comment:
austin362667 commented on code in PR #27:
URL: https://github.com/apache/datafusion-ray/pull/27#discussion_r1798588305
##
docs/README.md:
##
@@ -260,13 +257,14 @@ child plans, building up a DAG of futures.
## Distributed Shuffle
-The output of each query stage needs to be p
edmondop opened a new pull request, #31:
URL: https://github.com/apache/datafusion-ray/pull/31
(no comment)
findepi opened a new pull request, #12914:
URL: https://github.com/apache/datafusion/pull/12914
Unsafe `StructArray::new_unchecked` is a more performant alternative to
`StructArray::new`. In test code there is no benefit from using the unsafe code
path.
lovasoa commented on code in PR #1474:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1474#discussion_r1798561445
##
src/parser/mod.rs:
##
@@ -9416,27 +9416,42 @@ impl<'a> Parser<'a> {
}
}
+fn parse_set_role(&mut self, modifier: Option) ->
Opti
lovasoa commented on PR #1474:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1474#issuecomment-2409117537
I changed the code to use maybe_parse, which is indeed much cleaner and
resolves both your points. Thanks @iffyio !
andygrove commented on issue #30:
URL: https://github.com/apache/datafusion-ray/issues/30#issuecomment-2409070032
Thanks. I hadn't understood that from reading the docs, but it makes sense
now.
Rachelint commented on PR #12809:
URL: https://github.com/apache/datafusion/pull/12809#issuecomment-2409067977
@alamb 👍 Thanks for the reminder about test coverage.
After checking the code again more carefully, I found that some test cases indeed
don't cover the code paths I expected.
andygrove commented on code in PR #29:
URL: https://github.com/apache/datafusion-ray/pull/29#discussion_r1798471260
##
.github/workflows/k8s.yml:
##
@@ -0,0 +1,32 @@
+name: Kubernetes
Review Comment:
I'll go ahead and merge so we can test. I added a bullet to
https://github
andygrove merged PR #29:
URL: https://github.com/apache/datafusion-ray/pull/29
eejbyfeldt opened a new pull request, #12913:
URL: https://github.com/apache/datafusion/pull/12913
## Which issue does this PR close?
Follow up to #12814
## Rationale for this change
This addresses a bug that previously always replaced a `% 1` expression with a 0 of
type i3
timsaucer commented on code in PR #909:
URL: https://github.com/apache/datafusion-python/pull/909#discussion_r1798465579
##
python/datafusion/dataframe.py:
##
@@ -163,7 +163,20 @@ def with_column(self, name: str, expr: Expr) -> DataFrame:
def with_columns(
self, *e
timsaucer commented on code in PR #908:
URL: https://github.com/apache/datafusion-python/pull/908#discussion_r1798465440
##
python/datafusion/dataframe.py:
##
@@ -175,7 +178,23 @@ def with_column_renamed(self, old_name: str, new_name:
str) -> DataFrame:
Returns:
vakarisbk commented on issue #30:
URL: https://github.com/apache/datafusion-ray/issues/30#issuecomment-2409037630
Cluster version is determined by the container image that is used to launch
the cluster (step 3 in the docs).
The default container image is `rayproject/ray:2.9.0`. The d
ion-elgreco commented on issue #12906:
URL: https://github.com/apache/datafusion/issues/12906#issuecomment-2409035766
@Omega359 indeed, for now I've opened a PR for a fill_null in
datafusion-python that calls nvl:
https://github.com/apache/datafusion-python/pull/919
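As a minimal sketch of the semantics only: `nvl` returns its second argument when the first is null. The pure-Python `fill_null` helper below is hypothetical and stands in for the DataFrame method; it is not the code from the linked PR.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def nvl(value: Optional[T], default: T) -> T:
    """Return `default` when `value` is null (None), otherwise `value`."""
    return default if value is None else value


def fill_null(column: Sequence[Optional[T]], default: T) -> List[T]:
    """Hypothetical column-level helper: apply `nvl` to every element."""
    return [nvl(v, default) for v in column]
```

Note that only `None` counts as null here, so falsy values such as `0` or `""` are left untouched.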
ion-elgreco opened a new pull request, #919:
URL: https://github.com/apache/datafusion-python/pull/919
# Which issue does this PR close?
- related https://github.com/apache/datafusion-python/issues/875
# Rationale for this change
Fill_na/null is a quite common method for doing t
eejbyfeldt commented on code in PR #12902:
URL: https://github.com/apache/datafusion/pull/12902#discussion_r1798432714
##
datafusion/sqllogictest/test_files/errors.slt:
##
@@ -133,3 +133,7 @@ create table foo as values (1), ('foo');
query error No function matches
select 1 g
2010YOUY01 commented on code in PR #12888:
URL: https://github.com/apache/datafusion/pull/12888#discussion_r1798425953
##
datafusion/physical-plan/src/aggregates/row_hash.rs:
##
@@ -102,6 +102,19 @@ struct SpillState {
/// true when streaming merge is in progress
is_
mbutrovich commented on PR #987:
URL: https://github.com/apache/datafusion-comet/pull/987#issuecomment-2409027007
Merged in updated main, thanks for the quick fix!
ion-elgreco commented on code in PR #909:
URL: https://github.com/apache/datafusion-python/pull/909#discussion_r1798413292
##
python/datafusion/dataframe.py:
##
@@ -163,7 +163,20 @@ def with_column(self, name: str, expr: Expr) -> DataFrame:
def with_columns(
self,
ion-elgreco opened a new issue, #918:
URL: https://github.com/apache/datafusion-python/issues/918
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
Instead of having users explicitly create a `SessionContext`, we could just
create one
andygrove commented on code in PR #29:
URL: https://github.com/apache/datafusion-ray/pull/29#discussion_r1798408562
##
.github/workflows/k8s.yml:
##
@@ -0,0 +1,32 @@
+name: Kubernetes
Review Comment:
I would also be okay with explicitly excluding GitHub workflows from RAT
c
timsaucer commented on code in PR #909:
URL: https://github.com/apache/datafusion-python/pull/909#discussion_r1798407548
##
python/datafusion/dataframe.py:
##
@@ -163,7 +163,20 @@ def with_column(self, name: str, expr: Expr) -> DataFrame:
def with_columns(
self, *e
Omega359 opened a new pull request, #12912:
URL: https://github.com/apache/datafusion/pull/12912
## Which issue does this PR close?
Closes #12898
## Rationale for this change
This refactor is to eliminate the requirement for other function modules to
depend on th
ion-elgreco opened a new pull request, #917:
URL: https://github.com/apache/datafusion-python/pull/917
@timsaucer we could probably drop the rust code for from_pandas and
from_polars and keep those in Python as aliases but just call from_arrow or
even drop them. The only thing is we will ha
andygrove opened a new issue, #30:
URL: https://github.com/apache/datafusion-ray/issues/30
I followed the k8s docs and it set up a cluster using Ray 2.9.0. DataFusion
Ray requires Ray 2.37.0, so I think that we should add documentation explaining
how to upgrade.
andygrove commented on code in PR #29:
URL: https://github.com/apache/datafusion-ray/pull/29#discussion_r1798403887
##
.github/workflows/k8s.yml:
##
@@ -0,0 +1,32 @@
+name: Kubernetes
Review Comment:
Could you add the ASF license header? It looks like we don't have the RAT
andygrove commented on code in PR #27:
URL: https://github.com/apache/datafusion-ray/pull/27#discussion_r1798402790
##
docs/README.md:
##
@@ -260,13 +257,14 @@ child plans, building up a DAG of futures.
## Distributed Shuffle
-The output of each query stage needs to be pers
andygrove commented on code in PR #27:
URL: https://github.com/apache/datafusion-ray/pull/27#discussion_r1798401864
##
docs/README.md:
##
@@ -17,12 +17,12 @@
under the License.
-->
-# RaySQL Design Documentation
+# Datafusion Ray Design Documentation
Review Comment:
ni
ion-elgreco commented on PR #916:
URL: https://github.com/apache/datafusion-python/pull/916#issuecomment-2409012940
> Very nice addition, but it looks like this branch is on top of your other
one since it uses `with_columns` so we will need to merge that one before this.
Yes, we nee
juroberttyb commented on PR #12908:
URL: https://github.com/apache/datafusion/pull/12908#issuecomment-2409012291
PR updated with `get_pow_doc()` removed.
juroberttyb commented on code in PR #12908:
URL: https://github.com/apache/datafusion/pull/12908#discussion_r1798392093
##
datafusion/functions/src/math/monotonicity.rs:
##
@@ -218,14 +545,76 @@ pub fn sqrt_order(input: &[ExprProperties]) ->
Result {
}
}
+static DOCUMEN
ion-elgreco opened a new issue, #12911:
URL: https://github.com/apache/datafusion/issues/12911
Is there a better way we could do this? Maybe add something
upstream if necessary?
As I'm thinking of it, I don't know that this operation is necessarily well
defined. Just li
ion-elgreco commented on code in PR #915:
URL: https://github.com/apache/datafusion-python/pull/915#discussion_r1798390360
##
python/datafusion/dataframe.py:
##
@@ -223,6 +223,30 @@ def limit(self, count: int, offset: int = 0) -> DataFrame:
"""
return DataFrame
ion-elgreco commented on code in PR #915:
URL: https://github.com/apache/datafusion-python/pull/915#discussion_r1798390091
##
python/datafusion/dataframe.py:
##
@@ -223,6 +223,30 @@ def limit(self, count: int, offset: int = 0) -> DataFrame:
"""
return DataFrame
ion-elgreco commented on code in PR #914:
URL: https://github.com/apache/datafusion-python/pull/914#discussion_r1798388988
##
python/tests/test_dataframe.py:
##
@@ -259,6 +259,43 @@ def test_join():
assert table.to_pydict() == expected
+def test_join_on():
+ctx = Se
my-vegetable-has-exploded commented on PR #12754:
URL: https://github.com/apache/datafusion/pull/12754#issuecomment-2409003636

It seems the main cost is sorting.
tlm365 opened a new pull request, #12910:
URL: https://github.com/apache/datafusion/pull/12910
## Rationale for this change
Using the `unary` functions allows faster processing by avoiding branching
on nulls.
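As a rough illustration of the idea (not the arrow-rs implementation), the sketch below stores values and validity separately, in the style of an Arrow array, and applies the operation to every slot without checking for nulls. The function names are hypothetical, and plain Python will not show the speedup; it only shows why no per-element null branch is needed when the operation cannot fail.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def unary(
    values: Sequence[float],
    validity: Sequence[bool],
    op: Callable[[float], float],
) -> Tuple[List[float], List[bool]]:
    """Apply `op` to every slot, null or not, and carry the validity mask over.

    Null slots still hold some placeholder value, so an infallible `op` can
    run on them safely; the result in those slots is simply ignored.
    """
    return [op(v) for v in values], list(validity)


def trunc_all(
    values: Sequence[float], validity: Sequence[bool]
) -> Tuple[List[float], List[bool]]:
    """Example kernel built on `unary`: truncate toward zero."""
    return unary(values, validity, math.trunc)


def to_options(
    values: Sequence[float], validity: Sequence[bool]
) -> List[Optional[float]]:
    """Materialize nulls only when the caller needs them."""
    return [v if ok else None for v, ok in zip(values, validity)]
```

The null check happens once, when reading results through the mask, rather than inside the loop that does the arithmetic.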
## What changes are included in this PR?
- Apply `unary`
tlm365 commented on code in PR #12909:
URL: https://github.com/apache/datafusion/pull/12909#discussion_r1798363818
##
datafusion/functions/src/math/trunc.rs:
##
@@ -111,44 +111,66 @@ fn trunc(args: &[ArrayRef]) -> Result {
);
}
-//if only one arg then invoke
Omega359 commented on issue #12898:
URL: https://github.com/apache/datafusion/issues/12898#issuecomment-2408988141
take
Rachelint commented on PR #12809:
URL: https://github.com/apache/datafusion/pull/12809#issuecomment-2408987937
> Thank you so much @Rachelint -- this looks so great. I found it well
commented, well structured, and well tested.
>
> cc @jayzhan211 your GroupColumn pattern is really work
simonvandel commented on code in PR #12909:
URL: https://github.com/apache/datafusion/pull/12909#discussion_r1798350152
##
datafusion/functions/src/math/trunc.rs:
##
@@ -111,44 +111,66 @@ fn trunc(args: &[ArrayRef]) -> Result {
);
}
-//if only one arg then in
Omega359 commented on issue #12907:
URL: https://github.com/apache/datafusion/issues/12907#issuecomment-2408986299
referenced in https://github.com/apache/datafusion-python/issues/875
Omega359 commented on issue #12906:
URL: https://github.com/apache/datafusion/issues/12906#issuecomment-2408983472
This seems like it may almost be an alias for
[nvl](https://datafusion.apache.org/user-guide/sql/scalar_functions_new.html#nvl)
or
[nvl2](https://datafusion.apache.org/user-gu
ion-elgreco commented on code in PR #909:
URL: https://github.com/apache/datafusion-python/pull/909#discussion_r1798340584
##
python/datafusion/dataframe.py:
##
@@ -160,6 +160,40 @@ def with_column(self, name: str, expr: Expr) -> DataFrame:
"""
return DataFrame
Omega359 commented on PR #12908:
URL: https://github.com/apache/datafusion/pull/12908#issuecomment-2408982151
Beyond the `pow` removal this PR LGTM 👍
ion-elgreco commented on code in PR #908:
URL: https://github.com/apache/datafusion-python/pull/908#discussion_r1798339065
##
python/datafusion/dataframe.py:
##
@@ -175,7 +178,23 @@ def with_column_renamed(self, old_name: str, new_name:
str) -> DataFrame:
Returns:
Omega359 commented on code in PR #12908:
URL: https://github.com/apache/datafusion/pull/12908#discussion_r1798337489
##
datafusion/functions/src/math/monotonicity.rs:
##
@@ -218,14 +545,76 @@ pub fn sqrt_order(input: &[ExprProperties]) ->
Result {
}
}
+static DOCUMENTAT
timsaucer commented on code in PR #916:
URL: https://github.com/apache/datafusion-python/pull/916#discussion_r1798329026
##
python/datafusion/dataframe.py:
##
@@ -211,6 +245,19 @@ def sort(self, *exprs: Expr | SortExpr) -> DataFrame:
exprs_raw = [sort_or_default(expr) f
tlm365 opened a new pull request, #12909:
URL: https://github.com/apache/datafusion/pull/12909
## Rationale for this change
Same idea as https://github.com/apache/datafusion/pull/12881. Using the
`unary`/`binary` functions allows faster processing (most likely auto-vectorized
code) by avo
Rachelint commented on code in PR #12809:
URL: https://github.com/apache/datafusion/pull/12809#discussion_r1798329004
##
datafusion/physical-plan/src/aggregates/group_values/group_column.rs:
##
@@ -376,6 +385,399 @@ where
}
}
+/// An implementation of [`GroupColumn`] for
timsaucer commented on code in PR #915:
URL: https://github.com/apache/datafusion-python/pull/915#discussion_r1798327582
##
python/datafusion/dataframe.py:
##
@@ -223,6 +223,30 @@ def limit(self, count: int, offset: int = 0) -> DataFrame:
"""
return DataFrame(s
Rachelint commented on code in PR #12809:
URL: https://github.com/apache/datafusion/pull/12809#discussion_r1798325968
##
datafusion/physical-plan/src/aggregates/group_values/group_column.rs:
##
@@ -376,6 +385,399 @@ where
}
}
+/// An implementation of [`GroupColumn`] for
juroberttyb commented on issue #12867:
URL: https://github.com/apache/datafusion/issues/12867#issuecomment-2408973361
Hi @alamb, thank you for providing this opportunity for newcomers to learn
more about the repo!
I have opened a PR, could you or someone help me review it when you ha
timsaucer commented on code in PR #914:
URL: https://github.com/apache/datafusion-python/pull/914#discussion_r1798323879
##
python/tests/test_dataframe.py:
##
@@ -259,6 +259,43 @@ def test_join():
assert table.to_pydict() == expected
+def test_join_on():
+ctx = Sess
alamb commented on issue #12821:
URL: https://github.com/apache/datafusion/issues/12821#issuecomment-2408972279
Thank you @tustvold -- that content is so good I made a PR to propose
putting it in the readme of arrow-rs:
https://github.com/apache/arrow-rs/pull/6554
juroberttyb opened a new pull request, #12908:
URL: https://github.com/apache/datafusion/pull/12908
## Which issue does this PR close?
Closes #12867.
## Rationale for this change
## What changes are included in this PR?
## Are these changes
timsaucer commented on code in PR #912:
URL: https://github.com/apache/datafusion-python/pull/912#discussion_r1798320329
##
python/datafusion/dataframe.py:
##
@@ -284,14 +324,41 @@ def join(
Args:
right: Other DataFrame to join with.
-join_key
timsaucer commented on code in PR #909:
URL: https://github.com/apache/datafusion-python/pull/909#discussion_r1798315618
##
python/datafusion/dataframe.py:
##
@@ -160,6 +160,40 @@ def with_column(self, name: str, expr: Expr) -> DataFrame:
"""
return DataFrame(s
alamb commented on code in PR #12809:
URL: https://github.com/apache/datafusion/pull/12809#discussion_r1798291818
##
datafusion/physical-plan/src/aggregates/group_values/group_column.rs:
##
@@ -579,4 +986,208 @@ mod tests {
assert!(!builder.equal_to(4, &input_array, 4))
tustvold commented on issue #12821:
URL: https://github.com/apache/datafusion/issues/12821#issuecomment-2408966481
I found that LLVM is relatively good at vectorizing vertical operations
provided:
* There are no conditionals within the loop body
* You've been careful to avoid inlin
timsaucer commented on code in PR #908:
URL: https://github.com/apache/datafusion-python/pull/908#discussion_r1798309123
##
python/datafusion/dataframe.py:
##
@@ -175,7 +178,23 @@ def with_column_renamed(self, old_name: str, new_name:
str) -> DataFrame:
Returns:
alamb commented on issue #12821:
URL: https://github.com/apache/datafusion/issues/12821#issuecomment-2408961671
> Can we use nightly Rust, which enables `std::simd`, for vectorization? Although
in arrow-rs the SIMD code was rewritten to rely on auto-vectorization, when I
check the generated asm, I
alamb commented on PR #12850:
URL: https://github.com/apache/datafusion/pull/12850#issuecomment-2408959974
Thanks again @adriangb
alamb merged PR #12850:
URL: https://github.com/apache/datafusion/pull/12850
alamb commented on issue #11682:
URL: https://github.com/apache/datafusion/issues/11682#issuecomment-2408959341
Update: we have enough of the pieces implemented thanks to @Rachelint and
@goldmedal and @jayzhan211 so I have hacked it together in a branch and am
now running the performance
alamb commented on PR #12816:
URL: https://github.com/apache/datafusion/pull/12816#issuecomment-2408955172
I rebased / squashed all the code in this branch so it would be easier to
pull in to test in https://github.com/apache/datafusion/pull/12092
alamb commented on PR #12792:
URL: https://github.com/apache/datafusion/pull/12792#issuecomment-2408953006
🚀
alamb closed issue #6906: Implement fast min/max accumulator for binary /
strings (now it uses the slower path)
URL: https://github.com/apache/datafusion/issues/6906
alamb merged PR #12792:
URL: https://github.com/apache/datafusion/pull/12792
ion-elgreco opened a new pull request, #916:
URL: https://github.com/apache/datafusion-python/pull/916
# Which issue does this PR close?
- related https://github.com/apache/datafusion-python/issues/875
# Rationale for this change
A top level cast is very practical and a common p
alamb commented on PR #12092:
URL: https://github.com/apache/datafusion/pull/12092#issuecomment-2408950500
My plan is to pull the changes from the following PRs into this PR and rerun
the overall perf test
- [ ] https://github.com/apache/datafusion/pull/12792
- [ ] https://github.com
alamb commented on code in PR #12809:
URL: https://github.com/apache/datafusion/pull/12809#discussion_r1798283664
##
datafusion/physical-plan/src/aggregates/group_values/group_column.rs:
##
@@ -376,6 +385,399 @@ where
}
}
+/// An implementation of [`GroupColumn`] for bin
alamb commented on code in PR #12809:
URL: https://github.com/apache/datafusion/pull/12809#discussion_r1798281370
##
datafusion/physical-plan/src/aggregates/group_values/group_column.rs:
##
@@ -376,6 +385,399 @@ where
}
}
+/// An implementation of [`GroupColumn`] for bin
ion-elgreco opened a new issue, #12907:
URL: https://github.com/apache/datafusion/issues/12907
### Is your feature request related to a problem or challenge?
Pivoting and unpivoting are common use cases for data scientists, but they are
currently missing from the DF API.
### Describe
alamb commented on PR #12816:
URL: https://github.com/apache/datafusion/pull/12816#issuecomment-2408940698
Here is the performance of this PR. Some queries are slower, some are
faster.
I believe once we turn on string view everything will be faster.
```
---
alamb commented on code in PR #12792:
URL: https://github.com/apache/datafusion/pull/12792#discussion_r1798266004
##
datafusion/functions-aggregate-common/src/aggregate/groups_accumulator/nulls.rs:
##
@@ -91,3 +100,105 @@ pub fn filtered_null_mask(
let opt_filter = opt_filt
ion-elgreco opened a new issue, #12906:
URL: https://github.com/apache/datafusion/issues/12906
### Is your feature request related to a problem or challenge?
I would like to be able to fill_nulls per col/expr and on the dataframe
level, akin to polars/pyspark/pandas
### Describ