alamb merged PR #16156:
URL: https://github.com/apache/datafusion/pull/16156
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusi
alamb merged PR #16175:
URL: https://github.com/apache/datafusion/pull/16175
blaginin commented on PR #14684:
URL: https://github.com/apache/datafusion/pull/14684#issuecomment-2906801096
> @blaginin I see that https://github.com/apache/datafusion/pull/14781 is
still draft, this PR still ok otherwise?
yes, thanks for the reminder - there was some work happening
chenkovsky commented on PR #16161:
URL: https://github.com/apache/datafusion/pull/16161#issuecomment-2906801006
@eejbyfeldt could you please help me review this PR?
duongcongtoai commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105798200
##
datafusion/optimizer/src/create_dependent_join.rs:
##
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contrib
logan-keede commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105831888
##
datafusion/optimizer/src/eliminate_cross_join.rs:
##
@@ -351,6 +353,8 @@ fn find_inner_join(
join_type: JoinType::Inner,
join_constraint:
dependabot[bot] commented on PR #1125:
URL: https://github.com/apache/datafusion-python/pull/1125#issuecomment-2906979041
Superseded by #1134.
dependabot[bot] opened a new pull request, #1135:
URL: https://github.com/apache/datafusion-python/pull/1135
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.16.0 to 1.17.0.
Release notes
Sourced from [uuid's releases](https://github.com/uuid-rs/uuid/releases).
v1.17.0
dependabot[bot] opened a new pull request, #1134:
URL: https://github.com/apache/datafusion-python/pull/1134
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.44.2 to 1.45.1.
Release notes
Sourced from [tokio's releases](https://github.com/tokio-rs/tokio/releases).
Tokio
codecov-commenter commented on PR #1788:
URL: https://github.com/apache/datafusion-comet/pull/1788#issuecomment-2906975530
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1788?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
dependabot[bot] closed pull request #1125: build(deps): bump tokio from 1.44.2
to 1.45.0
URL: https://github.com/apache/datafusion-python/pull/1125
comphead opened a new issue, #1789:
URL: https://github.com/apache/datafusion-comet/issues/1789
`map_values` will be addressed in a following PR. It has a
correctness issue
```
- read map[struct, struct] from parquet *** FAILED *** (1 second, 478
milliseconds)
Res
comphead commented on PR #1788:
URL: https://github.com/apache/datafusion-comet/pull/1788#issuecomment-2907578088
`map_values` will be addressed in a following PR. It has a correctness issue
```
- read map[struct, struct] from parquet *** FAILED *** (1 second, 478
milliseconds)
lifan-ake opened a new pull request, #16184:
URL: https://github.com/apache/datafusion/pull/16184
## Which issue does this PR close?
- Closes #15792 .
## Rationale for this change
## What changes are included in this PR?
migrate `logical_plan` t
lifan-ake commented on PR #16184:
URL: https://github.com/apache/datafusion/pull/16184#issuecomment-2907587385
Hi @alamb and @blaginin, I'm trying to solve issue #15792.
This PR is ready to review, please take a look once you have time.
In my opinion, the doc should be e
alamb opened a new issue, #16180:
URL: https://github.com/apache/datafusion/issues/16180
### Describe the bug
The extended tests are failing intermittently on main (mostly pass, but
sometimes fail)
Here is an example failure:
https://github.com/apache/datafusion/actions/
alamb commented on code in PR #16167:
URL: https://github.com/apache/datafusion/pull/16167#discussion_r2105792781
##
datafusion/functions-nested/src/length.rs:
##
@@ -128,26 +148,20 @@ pub fn array_length_inner(args: &[ArrayRef]) ->
Result {
match &args[0].data_type() {
alamb commented on PR #16161:
URL: https://github.com/apache/datafusion/pull/16161#issuecomment-2906786682
Thanks @chenkovsky -- can you find the original PR that added this
`GROUPING` function and perhaps @ mention the author to see if they have any
feedback / could help with review?
alamb merged PR #16138:
URL: https://github.com/apache/datafusion/pull/16138
agis opened a new issue, #1859:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1859
Given the following input:
```sql
ALTER TABLE logs DROP COLUMN details
```
the following code outputs the statement as:
```sql
ALTER TABLE logs DROP details
```
alamb commented on code in PR #16165:
URL: https://github.com/apache/datafusion/pull/16165#discussion_r2105793257
##
datafusion/physical-plan/src/aggregates/mod.rs:
##
@@ -57,6 +57,10 @@ mod row_hash;
mod topk;
mod topk_stream;
+/// Hard-coded seed for aggregations to ensure
alamb commented on PR #15700:
URL: https://github.com/apache/datafusion/pull/15700#issuecomment-2906787792
@2010YOUY01 and @ding-young I wonder if you can review this PR again to help
@rluvaton get it merged?
Specifically if it needs more tests perhaps you can help identify which are
alamb commented on PR #16138:
URL: https://github.com/apache/datafusion/pull/16138#issuecomment-2906783551
Thanks again @atahanyorganci
alamb commented on PR #16165:
URL: https://github.com/apache/datafusion/pull/16165#issuecomment-2906808852
🤖: Benchmark completed
Details
```
Comparing HEAD and fix_aggregation-seed
Benchmark clickbench_extended.json
andygrove commented on PR #1773:
URL: https://github.com/apache/datafusion-comet/pull/1773#issuecomment-2906822147
> I have rebased this branch onto the latest main. The [ubuntu-latest/java
17-spark-4.0/java](https://github.com/apache/datafusion-comet/actions/runs/15222051464/job/4281914033
onlyjackfrost commented on PR #16181:
URL: https://github.com/apache/datafusion/pull/16181#issuecomment-2906858277
@alamb, could you help review this PR?
I have a general understanding of it, but I haven't fully grasped all the
details yet.
So I only added the diagram and provided
rluvaton commented on PR #15700:
URL: https://github.com/apache/datafusion/pull/15700#issuecomment-2906947109
> > @2010YOUY01 and @ding-young I wonder if you can review this PR again to
help @rluvaton get it merged?
>
> >
>
> > Specifically if it needs more tests perhaps you c
Dandandan merged PR #16153:
URL: https://github.com/apache/datafusion/pull/16153
andygrove opened a new pull request, #1787:
URL: https://github.com/apache/datafusion-comet/pull/1787
## Which issue does this PR close?
Closes #.
## Rationale for this change
## What changes are included in this PR?
## How are these changes
gabotechs opened a new pull request, #16183:
URL: https://github.com/apache/datafusion/pull/16183
## Which issue does this PR close?
It probably does not fully close it, but it partially addresses:
- https://github.com/apache/datafusion/issues/15069
## Rationale for t
onlyjackfrost commented on PR #16181:
URL: https://github.com/apache/datafusion/pull/16181#issuecomment-2907602002
@comphead, I'm not sure what you mean by overlap.
The diagram should look like this:
https://github.com/user-attachments/assets/4cff6c5d-892b-47a0-8c36-9bee8ae8cca3
andygrove merged PR #1773:
URL: https://github.com/apache/datafusion-comet/pull/1773
UBarney commented on code in PR #15954:
URL: https://github.com/apache/datafusion/pull/15954#discussion_r2105771450
##
datafusion/physical-plan/src/aggregates/mod.rs:
##
@@ -733,13 +733,33 @@ impl AggregateExec {
&self.input_order_mode
}
-fn statistics_inner(
Dandandan commented on PR #16153:
URL: https://github.com/apache/datafusion/pull/16153#issuecomment-2906697087
> > This optimization is neat and already covers the common case of joins on
primary keys. I think we can further optimize the join hash table - even for
cases where _some_ keys mi
irenjj commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105849608
##
datafusion/optimizer/src/create_dependent_join.rs:
##
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor li
andygrove commented on issue #15771:
URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2906874047
Since updating Comet to use latest DataFusion (pinned dependency), we have
been seeing regular but intermittent CI failures that we are still trying to
debug.
It may or
andygrove closed issue #1615: Q23 fails when running TPC-DS SF=1 because of
invalid offset buffer being exported for empty StringArray.
URL: https://github.com/apache/datafusion-comet/issues/1615
jsai28 commented on PR #16079:
URL: https://github.com/apache/datafusion/pull/16079#issuecomment-2907011742
@alamb I think this is ready for review. I converted all of the tests in
`params.rs` that included an sql statement, expected types, and param values.
duongcongtoai commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105871923
##
datafusion/optimizer/src/create_dependent_join.rs:
##
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contrib
ding-young opened a new pull request, #16182:
URL: https://github.com/apache/datafusion/pull/16182
## Which issue does this PR close?
- Closes #16160 .
## TODO
- [ ] print the ids of failed queries in tpch, sort_tpch.
- [ ] run locally and check the output
- [
irenjj commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105841172
##
datafusion/optimizer/src/create_dependent_join.rs:
##
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor li
duongcongtoai commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105839452
##
datafusion/optimizer/src/create_dependent_join.rs:
##
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contrib
irenjj commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105845507
##
datafusion/optimizer/src/eliminate_cross_join.rs:
##
@@ -351,6 +353,8 @@ fn find_inner_join(
join_type: JoinType::Inner,
join_constraint: JoinC
andygrove opened a new issue, #1786:
URL: https://github.com/apache/datafusion-comet/issues/1786
### Describe the bug
Since changing the DataFusion dependency to a git dependency on a pinned
revision of DataFusion in https://github.com/apache/datafusion-comet/pull/1710
we have been e
leoyvens commented on issue #15363:
URL: https://github.com/apache/datafusion/issues/15363#issuecomment-2906881168
A backwards compatibility issue to consider is that `.` is currently valid
in udf identifiers. I currently abuse this to emulate schema namespacing for
UDFs even though DataFus
alamb commented on PR #15022:
URL: https://github.com/apache/datafusion/pull/15022#issuecomment-2906788114
Thanks @rluvaton - I will try and find time to review this over the next
day or two
alamb commented on PR #16165:
URL: https://github.com/apache/datafusion/pull/16165#issuecomment-2906787528
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
alamb commented on issue #5492:
URL: https://github.com/apache/datafusion/issues/5492#issuecomment-2906789714
Please let me know when you have PRs ready for review or other items that I
can try and help with
alamb commented on issue #16106:
URL: https://github.com/apache/datafusion/issues/16106#issuecomment-2906792842
Thanks @aditanase -- in general I would classify this under the category of
the desire for a more sophisticated join reordering algorithm. I am pretty
skeptical that we will be a
jfahne commented on issue #16120:
URL: https://github.com/apache/datafusion/issues/16120#issuecomment-2906960544
Okay so I dug through it and found the error is coming from the following
call chain:
- The `LogicalPlanBuilder` returned by the `parse_join` calls `join_using`
on the bui
andygrove closed pull request #1787: chore: Specify JVM heap size when running
scalatest
URL: https://github.com/apache/datafusion-comet/pull/1787
jonathanc-n commented on issue #16179:
URL: https://github.com/apache/datafusion/issues/16179#issuecomment-2906971884
take
cj-zhukov commented on code in PR #16104:
URL: https://github.com/apache/datafusion/pull/16104#discussion_r210602
##
datafusion/common/src/array_conversion.rs:
##
@@ -0,0 +1,145 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license
ding-young commented on PR #15700:
URL: https://github.com/apache/datafusion/pull/15700#issuecomment-2906803554
@alamb Sure! I may not be able to provide a detailed review right away, but
I can definitely help by running the tests added in the PR locally and looking
into memory accounting f
chenkovsky commented on code in PR #16167:
URL: https://github.com/apache/datafusion/pull/16167#discussion_r2105801259
##
datafusion/functions-nested/src/length.rs:
##
@@ -128,26 +148,20 @@ pub fn array_length_inner(args: &[ArrayRef]) ->
Result {
match &args[0].data_type()
chenkovsky commented on PR #16161:
URL: https://github.com/apache/datafusion/pull/16161#issuecomment-2906804315
it's related to #12704
2010YOUY01 commented on PR #15700:
URL: https://github.com/apache/datafusion/pull/15700#issuecomment-2906805485
> @2010YOUY01 and @ding-young I wonder if you can review this PR again to
help @rluvaton get it merged?
>
> Specifically if it needs more tests perhaps you can help identify
Kontinuation commented on PR #1773:
URL: https://github.com/apache/datafusion-comet/pull/1773#issuecomment-2906791709
I have rebased this branch onto the latest main. The [ubuntu-latest/java
17-spark-4.0/java](https://github.com/apache/datafusion-comet/actions/runs/15222051464/job/428191403
Dandandan merged PR #16159:
URL: https://github.com/apache/datafusion/pull/16159
onlyjackfrost opened a new pull request, #16181:
URL: https://github.com/apache/datafusion/pull/16181
## Which issue does this PR close?
- Closes #15887
## Rationale for this change
Add docs to better explain how DataSource, FileSource, and
DataSourceExec are related
l1t1 commented on issue #1131:
URL: https://github.com/apache/datafusion-python/issues/1131#issuecomment-2906767514
@kosiew thank you, I learned it.
alamb commented on PR #16155:
URL: https://github.com/apache/datafusion/pull/16155#issuecomment-2906767977
Thank you again for the review @xudong963 and @adriangb
alamb merged PR #16154:
URL: https://github.com/apache/datafusion/pull/16154
alamb merged PR #16155:
URL: https://github.com/apache/datafusion/pull/16155
alamb commented on PR #16154:
URL: https://github.com/apache/datafusion/pull/16154#issuecomment-2906767875
Thanks again @findepi
alamb commented on PR #16145:
URL: https://github.com/apache/datafusion/pull/16145#issuecomment-2906775628
🚀 let's keep the code moving
alamb closed issue #16099: Migrate memory_pool/pool tests to insta
URL: https://github.com/apache/datafusion/issues/16099
alamb merged PR #16145:
URL: https://github.com/apache/datafusion/pull/16145
irenjj commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105792100
##
datafusion/optimizer/src/create_dependent_join.rs:
##
@@ -0,0 +1,163 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor li
logan-keede commented on code in PR #16174:
URL: https://github.com/apache/datafusion/pull/16174#discussion_r2105889448
##
datafusion/optimizer/src/eliminate_cross_join.rs:
##
@@ -351,6 +353,8 @@ fn find_inner_join(
join_type: JoinType::Inner,
join_constraint:
comphead commented on code in PR #16104:
URL: https://github.com/apache/datafusion/pull/16104#discussion_r2105889739
##
datafusion/core/src/macros.rs:
##
@@ -0,0 +1,66 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.
comphead opened a new pull request, #1788:
URL: https://github.com/apache/datafusion-comet/pull/1788
## Which issue does this PR close?
Closes #1781.
## Rationale for this change
## What changes are included in this PR?
## How are these chan
ctsk commented on PR #16153:
URL: https://github.com/apache/datafusion/pull/16153#issuecomment-2906567750
This optimization is neat and already covers the common case of joins on
primary keys. I think we can further optimize the join hash table - even for
cases where *some* keys might have
2010YOUY01 opened a new issue, #16176:
URL: https://github.com/apache/datafusion/issues/16176
### Is your feature request related to a problem or challenge?
It would be great to add an example under `datafusion-examples` to
illustrate the following:
1. Default Planning and Opti
2010YOUY01 opened a new issue, #16177:
URL: https://github.com/apache/datafusion/issues/16177
### Is your feature request related to a problem or challenge?
It would be great to include an example (under `datafusion-examples`) to
illustrate:
1. Show how to configure the DataFus
2010YOUY01 opened a new issue, #16178:
URL: https://github.com/apache/datafusion/issues/16178
### Is your feature request related to a problem or challenge?
DataFusion currently has around 100 configuration settings:
https://datafusion.apache.org/user-guide/configs.html — and the numb
2010YOUY01 commented on issue #16177:
URL: https://github.com/apache/datafusion/issues/16177#issuecomment-2906609382
I think it's something we can do to wrap up the GSoC project @ding-young
Dandandan commented on PR #16153:
URL: https://github.com/apache/datafusion/pull/16153#issuecomment-2906647265
> This optimization is neat and already covers the common case of joins on
primary keys. I think we can further optimize the join hash table - even for
cases where _some_ keys migh
Dandandan opened a new issue, #16179:
URL: https://github.com/apache/datafusion/issues/16179
### Is your feature request related to a problem or challenge?
Currently we save indices to the batch always as `u64` in the `HashTable`
and in the `next` `Vec`.
If we have less than `u32:M
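(Editor's sketch, not part of the issue text.) The snippet above is cut off, so the following is only a rough illustration of the general idea it seems to describe: picking a narrower index type when the row count allows it. The helper names `index_width_bytes` and `next_vec_bytes` are hypothetical, and the `u32::MAX` threshold is an assumption; none of this is taken from the DataFusion code base.

```rust
use std::mem::size_of;

/// Hypothetical helper: number of bytes per stored row index.
/// Assumption: `u32` is enough when the row count is below `u32::MAX`,
/// otherwise fall back to `u64`.
fn index_width_bytes(num_rows: usize) -> usize {
    if (num_rows as u64) < u64::from(u32::MAX) {
        size_of::<u32>()
    } else {
        size_of::<u64>()
    }
}

/// Hypothetical helper: bytes used by a chain vector (such as the `next`
/// `Vec` mentioned above) that holds one index per row.
fn next_vec_bytes(num_rows: usize) -> usize {
    num_rows * index_width_bytes(num_rows)
}
```

Under these assumptions, a chain vector for one million rows would need 4,000,000 bytes with `u32` indices, versus 8,000,000 bytes if every index were stored as `u64`.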