2010YOUY01 commented on issue #16065:
URL: https://github.com/apache/datafusion/issues/16065#issuecomment-2888233712
Welcome aboard! We're excited to collaborate with you for this GSoC project
😄
Regarding the plan, I can see the following sub-tasks:
1. Stabilize external sort
Rachelint commented on PR #15591:
URL: https://github.com/apache/datafusion/pull/15591#issuecomment-2888248445
OK... I guess it is due to the computation of `block_id` and `block_offset`.
I found a machine with similar cores and profiling.
Then I found in the old and so good int
Rachelint commented on PR #15591:
URL: https://github.com/apache/datafusion/pull/15591#issuecomment-2888248441
OK... I guess it is due to the computation of `block_id` and `block_offset`.
I found a machine with similar cores and profiling.
Then I found in the old and so good int
Adez017 opened a new pull request, #16074:
URL: https://github.com/apache/datafusion/pull/16074
## Which issue does this PR close?
- Closes #15777
## Rationale for this change
Added SQL examples for different functions in the window functions docs
#
Adez017 commented on PR #16074:
URL: https://github.com/apache/datafusion/pull/16074#issuecomment-2888498532
@alamb could you trigger the CI? Also, I think `./dev/update_function_docs.sh`
may need to be run, as it seems to hang on my machine.
--
This is an automated messag
comphead commented on PR #16070:
URL: https://github.com/apache/datafusion/pull/16070#issuecomment-2888488039
this might be related to https://github.com/apache/datafusion/pull/16062
duongcongtoai commented on issue #16073:
URL: https://github.com/apache/datafusion/issues/16073#issuecomment-2888503454
Do you think this draft
[PR](https://github.com/apache/datafusion/pull/16016/files#diff-500ed5b40952dd2bdecdd297383a15a290ac6314ea4cc6162b160ad05d01)
can combine all t
andygrove opened a new pull request, #1747:
URL: https://github.com/apache/datafusion-comet/pull/1747
## Which issue does this PR close?
N/A
Follows on from https://github.com/apache/datafusion-comet/pull/1746
## Rationale for this change
Rather tha
andygrove merged PR #1262:
URL: https://github.com/apache/datafusion-ballista/pull/1262
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr.
andygrove commented on code in PR #1260:
URL: https://github.com/apache/datafusion-ballista/pull/1260#discussion_r2094123218
##
ballista/scheduler/src/scheduler_server/query_stage_scheduler.rs:
##
@@ -376,9 +376,9 @@ mod tests {
)
.await?;
-let job_i
adriangb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2094127261
##
datafusion/datasource-parquet/src/opener.rs:
##
@@ -111,19 +120,61 @@ impl FileOpener for ParquetOpener {
.create(projected_schema, Arc::clone(&se
andygrove opened a new pull request, #1746:
URL: https://github.com/apache/datafusion-comet/pull/1746
## Which issue does this PR close?
Follows on from https://github.com/apache/datafusion-comet/pull/1744
## Rationale for this change
TBD
## What ch
adriangb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2094128404
##
datafusion/common/src/pruning.rs:
##
@@ -0,0 +1,490 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreement
adriangb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2094128256
##
datafusion/common/src/pruning.rs:
##
@@ -0,0 +1,490 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreement
Dandandan commented on PR #15380:
URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2888419860
So it seems re
> I am confused why my local Mac benchmark shows no regression for sort-tpch
Q3, while the generated Linux benchmark reproduces the regression.
I
irenjj opened a new issue, #16073:
URL: https://github.com/apache/datafusion/issues/16073
### Is your feature request related to a problem or challenge?
related to: #16015
It would be really nice to figure out how to combine these passes into one
unified set of decorrelation
timsaucer commented on PR #16053:
URL: https://github.com/apache/datafusion/pull/16053#issuecomment-2888458903
@kylebarron @paleolimbot I tested this latest push against the
`test_st_point` including the additional parts that were commented out. Do you
have other examples of problems you've
zhuqi-lucas commented on PR #15380:
URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2888423022
> > I am confused why my local Mac benchmark shows no regression for sort-tpch
Q3, while the generated Linux benchmark reproduces the regression.
>
> It may be that t
paleolimbot commented on PR #15663:
URL: https://github.com/apache/datafusion/pull/15663#issuecomment-2888431041
Closing since this is no longer needed!
paleolimbot closed pull request #15663: [WIP] Experiment with DataFusion
against Arrow with Extension DataType support
URL: https://github.com/apache/datafusion/pull/15663
codecov-commenter commented on PR #1746:
URL: https://github.com/apache/datafusion-comet/pull/1746#issuecomment-2888432508
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1746?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
zhuqi-lucas commented on PR #15380:
URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2888436428
Thank you @Dandandan @alamb,
I addressed it in the latest PR; there should be no regression now.
timsaucer commented on PR #15911:
URL: https://github.com/apache/datafusion/pull/15911#issuecomment-2888464798
> Do you think it is feasible to update the scalar, aggregate, and window
function APIs to use `FieldRef` instead of Field? That way we can avoid most
string copies.
Do you
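The idea in the quoted question can be sketched roughly as follows. In arrow-rs, `FieldRef` is an alias for `Arc<Field>`, so cloning one bumps a reference count instead of copying the field's name string. The `Field` struct, `FunctionArgs`, and `make_args` below are simplified, hypothetical stand-ins for illustration, not the actual DataFusion function APIs.

```rust
use std::sync::Arc;

/// Simplified stand-in for a schema field (an assumption for this sketch,
/// not the real arrow `Field` type).
#[derive(Debug)]
pub struct Field {
    pub name: String,
    pub nullable: bool,
}

/// Shared handle to a field: cloning copies a pointer, not the name string.
pub type FieldRef = Arc<Field>;

/// Hypothetical argument bundle passed to a function implementation.
pub struct FunctionArgs {
    pub arg_fields: Vec<FieldRef>,
}

/// Builds the argument bundle by sharing the existing fields.
/// `Arc::clone` only increments a reference count per field.
pub fn make_args(fields: &[FieldRef]) -> FunctionArgs {
    FunctionArgs {
        arg_fields: fields.iter().map(Arc::clone).collect(),
    }
}
```

With owned `Field` values the same function would have to deep-copy each field, including its `String` name, on every call.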
codecov-commenter commented on PR #1747:
URL: https://github.com/apache/datafusion-comet/pull/1747#issuecomment-2888529840
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1747?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
timsaucer merged PR #1126:
URL: https://github.com/apache/datafusion-python/pull/1126
gabotechs commented on code in PR #16071:
URL: https://github.com/apache/datafusion/pull/16071#discussion_r2094196073
##
datafusion/sqllogictest/test_files/fixed_size_list.slt:
##
Review Comment:
Really nice to get this tested! Maybe it's a bit overkill to add dedicated
.s
milenkovicm opened a new pull request, #1264:
URL: https://github.com/apache/datafusion-ballista/pull/1264
# Which issue does this PR close?
Closes #.
# Rationale for this change
looks like docker will only be released on rc candidate (`45.0.0-rc1`) but
not on full r
dependabot[bot] opened a new pull request, #1127:
URL: https://github.com/apache/datafusion-python/pull/1127
Bumps [object_store](https://github.com/apache/arrow-rs-object-store) from
0.12.0 to 0.12.1.
Changelog
Sourced from https://github.com/apache/arrow-rs-object-store/blob/main
dependabot[bot] opened a new pull request, #1128:
URL: https://github.com/apache/datafusion-python/pull/1128
Bumps [arrow](https://github.com/apache/arrow-rs) from 55.0.0 to 55.1.0.
Release notes
Sourced from arrow's releases (https://github.com/apache/arrow-rs/releases).
ar
dependabot[bot] commented on PR #1071:
URL: https://github.com/apache/datafusion-python/pull/1071#issuecomment-2888561264
Superseded by #1127.
dependabot[bot] closed pull request #1071: build(deps): bump object_store from
0.11.2 to 0.12.0
URL: https://github.com/apache/datafusion-python/pull/1071
dependabot[bot] opened a new pull request, #1129:
URL: https://github.com/apache/datafusion-python/pull/1129
Bumps [pyo3-build-config](https://github.com/pyo3/pyo3) from 0.24.1 to
0.25.0.
Release notes
Sourced from pyo3-build-config's releases (https://github.com/pyo3/pyo3/releases).
timsaucer merged PR #1112:
URL: https://github.com/apache/datafusion-python/pull/1112
timsaucer merged PR #1121:
URL: https://github.com/apache/datafusion-python/pull/1121
Dandandan commented on PR #15591:
URL: https://github.com/apache/datafusion/pull/15591#issuecomment-2888271159
> OK... I guess it is due to the computation of `block_id` and
`block_offset`.
>
> I found a machine with Intel cores similar to the testing machine's and
profiled it.
>
Rachelint commented on PR #15591:
URL: https://github.com/apache/datafusion/pull/15591#issuecomment-2888275782
> we could enforce blocksize being a power of two so we can avoid the
expensive div and mod operations?
Yes, I am trying to do it.
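A minimal sketch of the power-of-two idea discussed above, assuming a flat row index is split into a block id and an offset within the block. `BlockIndexer` and its methods are hypothetical names for illustration and are not taken from the PR. When the block size is `2^k`, division becomes a right shift by `k` and the modulo becomes a bitwise AND with `block_size - 1`.

```rust
/// Hypothetical helper that splits a flat index into `(block_id, block_offset)`
/// for blocks whose size is a power of two.
#[derive(Debug, Clone, Copy)]
pub struct BlockIndexer {
    /// log2(block_size), used in place of division.
    shift: u32,
    /// block_size - 1, used in place of modulo.
    mask: usize,
}

impl BlockIndexer {
    /// Returns `None` unless `block_size` is a non-zero power of two.
    pub fn try_new(block_size: usize) -> Option<Self> {
        if !block_size.is_power_of_two() {
            return None;
        }
        Some(Self {
            shift: block_size.trailing_zeros(),
            mask: block_size - 1,
        })
    }

    /// Equivalent to `index / block_size`.
    #[inline]
    pub fn block_id(&self, index: usize) -> usize {
        index >> self.shift
    }

    /// Equivalent to `index % block_size`.
    #[inline]
    pub fn block_offset(&self, index: usize) -> usize {
        index & self.mask
    }
}
```

Rejecting other sizes in the constructor keeps the shift and mask valid for every later lookup, so the hot path needs no further checks.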
kadai0308 commented on PR #16068:
URL: https://github.com/apache/datafusion/pull/16068#issuecomment-2888172345
> Thank you for this PR @kadai0308 -- very helpful
>
> I am not sure about the implications of `Box`ing all the variants -- I
worry it will just add additional overhead for p
hansott commented on code in PR #1705:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1705#discussion_r2094095237
##
src/tokenizer.rs:
##
@@ -1229,14 +1229,26 @@ impl<'a> Tokenizer<'a> {
// operators
'-' => {
c
vimko commented on code in PR #1705:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1705#discussion_r2094040996
##
src/tokenizer.rs:
##
@@ -1229,14 +1229,26 @@ impl<'a> Tokenizer<'a> {
// operators
'-' => {
cha
Adez017 commented on PR #15832:
URL: https://github.com/apache/datafusion/pull/15832#issuecomment-2888345095
@alamb cc: @xudong963 check it now
ajita-asthana opened a new pull request, #16077:
URL: https://github.com/apache/datafusion/pull/16077
## Which issue does this PR close?
- Closes #15986
## Rationale for this change
## What changes are included in this PR?
## Are these chang
logan-keede commented on code in PR #16071:
URL: https://github.com/apache/datafusion/pull/16071#discussion_r2094229214
##
datafusion/sqllogictest/test_files/fixed_size_list.slt:
##
Review Comment:
> Very nice that this was shipped so fast, thanks!
I did not do anyth
logan-keede opened a new pull request, #16078:
URL: https://github.com/apache/datafusion/pull/16078
## Which issue does this PR close?
- Closes #.
## Rationale for this change
- closes #15408
- closes #13648
## What changes are included in this PR?
logan-keede commented on PR #16078:
URL: https://github.com/apache/datafusion/pull/16078#issuecomment-2888652078
`cargo semver-checks` is probably too heavy to run on every push, but I have
kept it for testing purposes.
I would also like to hear other contributors' and committers' opinions.
jsai28 closed issue #15483: [DISCUSS] Data quality framework using DataFusion
URL: https://github.com/apache/datafusion/issues/15483
ding-young opened a new pull request, #1748:
URL: https://github.com/apache/datafusion-comet/pull/1748
## Which issue does this PR close?
It seems that `tpcbench.py` in `datafusion-benchmarks` no longer takes `name`
as an argument. This PR removes `--name` from `benchmarking_macos.md`
adriangb commented on issue #14993:
URL: https://github.com/apache/datafusion/issues/14993#issuecomment-2888737240
Another example that this enables:
https://docs.pinot.apache.org/basics/indexing/timestamp-index
jfahne opened a new pull request, #16082:
URL: https://github.com/apache/datafusion/pull/16082
## Which issue does this PR close?
- Closes #16017
## Rationale for this change
The dataframe `describe` method serves as a tidier way to produce standard
summ
jonathanc-n closed pull request #13252: feat: add RightMark Join
URL: https://github.com/apache/datafusion/pull/13252
ding-young opened a new pull request, #16081:
URL: https://github.com/apache/datafusion/pull/16081
## Which issue does this PR close?
Part of #16065
## Rationale for this change
Currently, datafusion-cli does not provide an option to use
`TrackConsumersPool` as memory p
ajita-asthana commented on PR #16076:
URL: https://github.com/apache/datafusion/pull/16076#issuecomment-2888593287
> ## Which issue does this PR close?
>
>
> * Closes #16009
>
> ## Rationale for this change
> ## What changes are included in this PR?
> ## Are these chan
paleolimbot commented on code in PR #16053:
URL: https://github.com/apache/datafusion/pull/16053#discussion_r2094213318
##
datafusion/physical-expr/src/expressions/literal.rs:
##
@@ -34,15 +36,37 @@ use datafusion_expr_common::interval_arithmetic::Interval;
use datafusion_expr_
ajita-asthana opened a new pull request, #16076:
URL: https://github.com/apache/datafusion/pull/16076
## Which issue does this PR close?
- Closes #.
## Rationale for this change
## What changes are included in this PR?
## Are these changes t
jonathanc-n opened a new pull request, #13252:
URL: https://github.com/apache/datafusion/pull/13252
## Which issue does this PR close?
Closes #13138 .
## Rationale for this change
## What changes are included in this PR?
## Are these changes tes
github-actions[bot] commented on PR #14954:
URL: https://github.com/apache/datafusion/pull/14954#issuecomment-2888711873
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
github-actions[bot] closed pull request #14922: BUG: schema_force_view_type
configuration not working for CREATE EXTERNAL TABLE
URL: https://github.com/apache/datafusion/pull/14922
adriangb opened a new pull request, #16080:
URL: https://github.com/apache/datafusion/pull/16080
Working on https://github.com/apache/datafusion/pull/16014 I think I found
that we collect parquet statistics by default on `ListingTable` *despite the
fact that the config option defaults to fa
github-actions[bot] closed pull request #14409: disable coercison for unmatched
struct type
URL: https://github.com/apache/datafusion/pull/14409
adriangb commented on PR #16080:
URL: https://github.com/apache/datafusion/pull/16080#issuecomment-2888713683
cc @alamb am I missing something here ?
adriangb commented on PR #16080:
URL: https://github.com/apache/datafusion/pull/16080#issuecomment-2888713483
We could also flip the default of `ListingTableOptions` which IMO is
reasonable (it should match the default in `SessionConfig`) and since
https://github.com/apache/datafusion/pull/
github-actions[bot] commented on PR #14411:
URL: https://github.com/apache/datafusion/pull/14411#issuecomment-2888711945
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
github-actions[bot] closed pull request #14590: Introducing mutation testing
URL: https://github.com/apache/datafusion/pull/14590
github-actions[bot] commented on PR #14180:
URL: https://github.com/apache/datafusion/pull/14180#issuecomment-2888711988
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
github-actions[bot] closed pull request #14710: [WIP] chore: Add detailed error
for sum::coerce_type
URL: https://github.com/apache/datafusion/pull/14710
adriangb commented on PR #14411:
URL: https://github.com/apache/datafusion/pull/14411#issuecomment-2888717383
I'm sad to see this go stale 😢. Sadly, I also don't have the bandwidth to
push it forward.
github-actions[bot] commented on PR #14684:
URL: https://github.com/apache/datafusion/pull/14684#issuecomment-2888711920
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
Adez017 commented on PR #16074:
URL: https://github.com/apache/datafusion/pull/16074#issuecomment-2888720939
> To get the examples updated in md files it would be needed to update
Documentation builders
Thanks for the suggestion, mate. I ran `./dev/update_function_docs.sh` but I
think i
adriangb commented on issue #3774:
URL: https://github.com/apache/datafusion/issues/3774#issuecomment-2888721001
I think this is now solved. There is a config option, that can be set in
datafusion-cli, to collect stats or not during planning. @Dandandan
@isidentical can we close the issue?
adriangb commented on PR #16080:
URL: https://github.com/apache/datafusion/pull/16080#issuecomment-2888721262
Looks like the config option was added in
https://github.com/apache/datafusion/pull/3846 and it's just never agreed with
`ListingTableOptions`
adriangb commented on PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#issuecomment-2888722612
My plan for this PR now is to first resolve blockers. In particular:
- https://github.com/apache/datafusion/pull/16069
- https://github.com/apache/datafusion/pull/16080
- PR
adriangb commented on PR #16080:
URL: https://github.com/apache/datafusion/pull/16080#issuecomment-2888725508
cc @Dandandan since you added the config option originally in #3846
jonathanc-n opened a new pull request, #16083:
URL: https://github.com/apache/datafusion/pull/16083
## Which issue does this PR close?
- Closes #13138 .
## Rationale for this change
Revamp implementation of the previous stale implementation for RightMark
##
jonathanc-n commented on PR #16083:
URL: https://github.com/apache/datafusion/pull/16083#issuecomment-2888794567
cc @comphead
jonathanc-n commented on code in PR #16083:
URL: https://github.com/apache/datafusion/pull/16083#discussion_r2094393265
##
datafusion/physical-plan/src/joins/nested_loop_join.rs:
##
@@ -1009,15 +1010,27 @@ fn join_left_and_right_batch(
right_side_ordered,
)?;
-
irenjj commented on issue #16073:
URL: https://github.com/apache/datafusion/issues/16073#issuecomment-201933
Cool! Thanks @duongcongtoai, maybe we can support lateral join after #16016
is merged.