berkaysynnada commented on code in PR #15473:
URL: https://github.com/apache/datafusion/pull/15473#discussion_r2030077255
##
datafusion/datasource/src/file_scan_config.rs:
##
@@ -858,6 +858,96 @@ impl FileScanConfig {
})
}
+/// Splits file groups into new gro
alamb commented on PR #15589:
URL: https://github.com/apache/datafusion/pull/15589#issuecomment-2781429711
I expect this to make a large performance difference when x is a string type
(as string comparisons are fairly expensive)
Thank you for this PR @ding-young and the great reviews
alamb commented on PR #15589:
URL: https://github.com/apache/datafusion/pull/15589#issuecomment-2781429451
We could also use a CASE
```sql
CASE WHEN x IS NOT NULL THEN true ELSE null END
```
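A minimal sketch (plain Python, not DataFusion code) of why `x = x`, `x IS NOT NULL OR NULL`, and the `CASE` form should agree under SQL three-valued logic. `None` stands in for SQL `NULL`, and all helper names here are hypothetical, chosen only for illustration:

```python
def sql_eq(a, b):
    """SQL `a = b`: the result is NULL if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_or(a, b):
    """SQL three-valued OR: true wins, then NULL, then false."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def rewrite_or(x):
    """Models `x IS NOT NULL OR NULL`."""
    return sql_or(x is not None, None)


def rewrite_case(x):
    """Models `CASE WHEN x IS NOT NULL THEN true ELSE NULL END`."""
    return True if x is not None else None


# All three forms yield true for non-NULL input and NULL for NULL input.
for value in (None, 0, 1, "a", ""):
    assert sql_eq(value, value) == rewrite_or(value) == rewrite_case(value)
```

Neither rewritten form compares the value with itself, which is where the saving for expensive comparisons such as strings would come from.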
--
This is an automated message from the Apache Git Service.
To respond to the message, pl
alamb commented on PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#issuecomment-2781430463
I see -- this is code simplification
getChan commented on PR #15604:
URL: https://github.com/apache/datafusion/pull/15604#issuecomment-2781414127
Closing: it isn't really redundant. See the issue comments.
getChan closed pull request #15604: FIX: Remove redundant repartition
URL: https://github.com/apache/datafusion/pull/15604
berkaysynnada commented on issue #15524:
URL: https://github.com/apache/datafusion/issues/15524#issuecomment-2781418213
> 1. just following the Duck and make a benchmark specific optimization (and
don't try to handle any other cases)
> 2. take the high road and say we aren't going to benc
alamb commented on PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#issuecomment-2781420170
I wonder if there is any way to write some tests for this (perhaps via
`EXPLAIN` in .slt tests to demonstrate that the unnecessary exec is removed)
alamb commented on PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#issuecomment-2781421263
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking) Running
Linux aal-dev 6.8.0-1016-gcp #18-Ubuntu SMP Fri Oct 4 22:16:29 UTC 2024
x86_
berkaysynnada commented on issue #15601:
URL: https://github.com/apache/datafusion/issues/15601#issuecomment-2781412638
> The round robin repartitioning is added to increase parallelism (by
increasing number of partitions). Hash repartitioning does also increase the
number of partitions, bu
alamb commented on PR #15597:
URL: https://github.com/apache/datafusion/pull/15597#issuecomment-2781418624
This PR has several CI failures so marking as a draft while they are
addressed.
(I do this to make it easier to see what PRs are waiting on review)
berkaysynnada commented on PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#issuecomment-2781422765
> > I wonder if there is any way to write some tests for this (perhaps via
`EXPLAIN` in .slt tests to demonstrate that the unnecessary exec is removed)
>
> I don't think a
berkaysynnada commented on PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#issuecomment-2781422523
> I wonder if there is any way to write some tests for this (perhaps via
`EXPLAIN` in .slt tests to demonstrate that the unnecessary exec is removed)
I don't think a redun
berkaysynnada commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2781418886
I'll take a look at this as well asap
zhuqi-lucas commented on issue #15524:
URL: https://github.com/apache/datafusion/issues/15524#issuecomment-2781423856
Thank you @alamb and @berkaysynnada for suggesting the next steps. I will create
a follow-up ticket for the 2nd option once I have completed option 1.
adriangb commented on code in PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#discussion_r2029972339
##
datafusion/physical-plan/src/execution_plan.rs:
##
@@ -467,6 +468,353 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
) -> Result>> {
alamb commented on code in PR #15578:
URL: https://github.com/apache/datafusion/pull/15578#discussion_r2030152111
##
datafusion/sql/tests/cases/diagnostic.rs:
##
@@ -136,7 +137,7 @@ fn test_table_not_found() -> Result<()> {
let query = "SELECT * FROM /*a*/personx/*a*/";
chenkovsky commented on PR #15600:
URL: https://github.com/apache/datafusion/pull/15600#issuecomment-2781426396
> @chenkovsky do you have any idea about the root cause of the problem? I
think this PR shouldn't close the issue until fixing/understanding the
underlying problem
@berkays
alamb commented on code in PR #15603:
URL: https://github.com/apache/datafusion/pull/15603#discussion_r2030153718
##
datafusion/physical-plan/src/stream.rs:
##
@@ -362,6 +362,8 @@ pin_project! {
#[pin]
stream: S,
+
+transform_schema: bool,
Review Com
alamb closed issue #15534: Remove `ParquetSource::pruning_predicate`
URL: https://github.com/apache/datafusion/issues/15534
alamb merged PR #15561:
URL: https://github.com/apache/datafusion/pull/15561
alamb commented on PR #15561:
URL: https://github.com/apache/datafusion/pull/15561#issuecomment-2781427143
Onwards towards topk pushdown
UBarney closed issue #15601: Redundant Repartition: `RoundRobinBatch` Followed
by `Hash` in Physical Plans
URL: https://github.com/apache/datafusion/issues/15601
UBarney commented on issue #15601:
URL: https://github.com/apache/datafusion/issues/15601#issuecomment-2781435121
Thanks for your explanation @Dandandan @berkaysynnada . I now understand
the benefit of adding roundrobin even followed by Hash
alamb commented on PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#issuecomment-2781439879
🤖: Benchmark completed
Details
```
Comparing HEAD and remove-hj-coalesce
Benchmark clickbench_extended.json
qstommyshu commented on PR #15578:
URL: https://github.com/apache/datafusion/pull/15578#issuecomment-2781439769
> Hi @alamb and @blaginin
>
> I found a way to migrate `roundtrip_statement_with_dialect()` now, just
want to confirm if you like to proceed with it.
>
> What I did b
getChan opened a new pull request, #15606:
URL: https://github.com/apache/datafusion/pull/15606
## Which issue does this PR close?
- Closes #.
## Rationale for this change
It is easy to understand because it means the number of `RepartitionExec`
input's output partit
rluvaton commented on issue #15323:
URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781454412
I have a working version locally and will create a PR soon, just one
problem, I don't think I can know the number of blocking threads tokio is
configured with.
this is i
andygrove commented on issue #15323:
URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781458966
> I have a working version locally and will create a PR soon, just one
problem, I don't think we can know the number of blocking threads tokio is
configured with.
>
> t
chenkovsky commented on code in PR #15603:
URL: https://github.com/apache/datafusion/pull/15603#discussion_r2030172787
##
datafusion/physical-plan/src/stream.rs:
##
@@ -362,6 +362,8 @@ pin_project! {
#[pin]
stream: S,
+
+transform_schema: bool,
Revie
adriangb commented on code in PR #15568:
URL: https://github.com/apache/datafusion/pull/15568#discussion_r2030171299
##
datafusion/physical-expr-common/src/physical_expr.rs:
##
@@ -283,6 +284,51 @@ pub trait PhysicalExpr: Send + Sync + Display + Debug +
DynEq + DynHash {
/
rluvaton commented on issue #15323:
URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781460121
>
> Comet currently creates a new tokio runtime per plan but there is a
proposal to move to a global tokio runtime (per executor) instead.
>
>
[apache/datafusion-com
shehabgamin commented on PR #15588:
URL: https://github.com/apache/datafusion/pull/15588#issuecomment-2781460484
> FYI @shehabgamin
>
>
>
> Do you have some time to review this PR
Yes! Will carve out some time on Monday.
Thanks @jayzhan211 !!
getChan commented on issue #15601:
URL: https://github.com/apache/datafusion/issues/15601#issuecomment-2781463692
FYI: tpch benchmark, main vs. removing the round-robin repartition
```sh
Benchmark tpch_sf1.json
┏━━┳
alamb commented on issue #15582:
URL: https://github.com/apache/datafusion/issues/15582#issuecomment-2781382025
> I would be happy to share / upstream any work I do on this if there is
interest.
Thanks @matthewmturner -- what I think would be really valuable is if you
could prov
alamb commented on issue #15513:
URL: https://github.com/apache/datafusion/issues/15513#issuecomment-2781382737
Thanks @aaryyya -- note we'll need to actually finish the work before we
can publish a blog
Of course, blog driven development does work pretty well -- it is basically
we
alamb commented on issue #7014:
URL: https://github.com/apache/datafusion/issues/7014#issuecomment-2781383363
Hi @aaryyya -- I think thanks to @tshauck and others with the library user
guide, this is mostly done now (so closing it)
- https://datafusion.apache.org/library-user-guide/
alamb closed issue #7014: Getting started guide for new users (who want to use
DataFusion in their project)
URL: https://github.com/apache/datafusion/issues/7014
getChan commented on code in PR #15604:
URL: https://github.com/apache/datafusion/pull/15604#discussion_r2030128874
##
datafusion/physical-plan/src/repartition/mod.rs:
##
@@ -510,7 +510,7 @@ impl DisplayAs for RepartitionExec {
writeln!(f, "partitioning_scheme={
alamb opened a new issue, #15605:
URL: https://github.com/apache/datafusion/issues/15605
Unrelated to the current fix, we should compare them using
normalized names to support
```sql
SELECT
t1.v1,
SUM(t1.v1) OVER W + 1
FROM
generate_series(1, 5) AS t1(v1
alamb commented on code in PR #15033:
URL: https://github.com/apache/datafusion/pull/15033#discussion_r2030129209
##
datafusion/sql/src/select.rs:
##
@@ -891,29 +892,42 @@ fn match_window_definitions(
named_windows: &[NamedWindowDefinition],
) -> Result<()> {
for proj
barsela1 opened a new pull request, #1798:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1798
support for parsing nested JOINs without parentheses
barsela1 closed pull request #1798: Support nested join without parentheses
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1798
getChan opened a new pull request, #15604:
URL: https://github.com/apache/datafusion/pull/15604
## Which issue does this PR close?
- Closes #15601
## Rationale for this change
- It seems like the purpose of `add_roundrobin` in `EnforceDistribution`
optimizer rule is
alamb commented on issue #15529:
URL: https://github.com/apache/datafusion/issues/15529#issuecomment-2781385443
@NGA-TRAN and @gabotechs can you please help review this PR?
alamb commented on PR #15595:
URL: https://github.com/apache/datafusion/pull/15595#issuecomment-2781386141
Thank you @XiangpengHao -- I also find CI debugging very long and painful
barsela1 opened a new pull request, #1799:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1799
(no comment)
berkaysynnada commented on PR #15597:
URL: https://github.com/apache/datafusion/pull/15597#issuecomment-2781391914
@djellemah thank you for working on this! Can we also add some tests to not
break these features in the future? and there are some failures in CI
alamb merged PR #15587:
URL: https://github.com/apache/datafusion/pull/15587
alamb commented on code in PR #15595:
URL: https://github.com/apache/datafusion/pull/15595#discussion_r2030130653
##
.github/workflows/rust.yml:
##
@@ -385,24 +385,24 @@ jobs:
linux-wasm-pack:
name: build with wasm-pack
-runs-on: ubuntu-latest
-container:
-
alamb merged PR #15590:
URL: https://github.com/apache/datafusion/pull/15590
alamb commented on code in PR #15590:
URL: https://github.com/apache/datafusion/pull/15590#discussion_r2030131422
##
datafusion/physical-plan/src/joins/mod.rs:
##
@@ -39,6 +40,11 @@ mod join_hash_map;
#[cfg(test)]
pub mod test_utils;
+/// The on clause of the join, as vector
qstommyshu commented on PR #15578:
URL: https://github.com/apache/datafusion/pull/15578#issuecomment-2781397695
Hi @alamb and @blaginin
I found a way to migrate `roundtrip_statement_with_dialect()` now, just want
to confirm if you like to proceed with it. What I did below is to incor
berkaysynnada commented on PR #15594:
URL: https://github.com/apache/datafusion/pull/15594#issuecomment-2781398502
Hello @kumarlokesh. Thank you for working on this. I have 2
questions/concerns. Let's discuss them a bit to get a future-proof design
1) There are also `runtime_env: A
berkaysynnada commented on code in PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#discussion_r2028923212
##
datafusion/physical-plan/src/joins/cross_join.rs:
##
@@ -189,19 +188,12 @@ impl CrossJoinExec {
/// Asynchronously collect the result of the left child
logan-keede opened a new pull request, #15602:
URL: https://github.com/apache/datafusion/pull/15602
## Which issue does this PR close?
- Closes #15443
## Rationale for this change
- The docs advise users to call a private function that used to be public.
#
milenkovicm commented on PR #15600:
URL: https://github.com/apache/datafusion/pull/15600#issuecomment-2781350515
I'm not an expert, but I don't think this issue is due to unbounded recursion
berkaysynnada commented on PR #15589:
URL: https://github.com/apache/datafusion/pull/15589#issuecomment-2781400999
> About the performance, I'm not 100% sure whether this rule is worth the
change, but I made this change because `IS NOT NULL OR NULL` was a bit better
than actual comparison.
Dandandan commented on issue #15601:
URL: https://github.com/apache/datafusion/issues/15601#issuecomment-2781405588
I don't think it is really redundant.
The round robin repartitioning is added to increase parallelism (by
increasing number of partitions).
Hash repartitioning does a
qstommyshu commented on PR #15595:
URL: https://github.com/apache/datafusion/pull/15595#issuecomment-2781406092
This could be very helpful for implementing CI for
`datafusion-wasm-bindings` as well!
getChan commented on code in PR #15604:
URL: https://github.com/apache/datafusion/pull/15604#discussion_r2030142110
##
datafusion/physical-optimizer/src/enforce_distribution.rs:
##
@@ -1258,19 +1259,14 @@ pub fn ensure_distribution(
child = add_spm_on_top(ch
berkaysynnada commented on PR #15600:
URL: https://github.com/apache/datafusion/pull/15600#issuecomment-2781407718
@chenkovsky do you have any idea about the root cause of the problem? I
think this PR shouldn't close the issue until fixing/understanding the
underlying problem
getChan commented on issue #15601:
URL: https://github.com/apache/datafusion/issues/15601#issuecomment-2781316139
take
ding-young commented on PR #15589:
URL: https://github.com/apache/datafusion/pull/15589#issuecomment-2781281225
@2010YOUY01 Instead of applying transformation on filter expression, I
adjusted the rule to transform x=x into `x IS NOT NULL OR NULL`. This preserves
the behavior where NULL = NU
getChan commented on issue #15601:
URL: https://github.com/apache/datafusion/issues/15601#issuecomment-2781294932
I agree with the suggestion. It seems like the purpose of add_roundrobin in
ensure_distribution optimizer is only to increase parallelism. it can achieve
that with just add_hash
logan-keede commented on PR #15602:
URL: https://github.com/apache/datafusion/pull/15602#issuecomment-2781360799
cc @findepi
chenkovsky opened a new pull request, #15603:
URL: https://github.com/apache/datafusion/pull/15603
## Which issue does this PR close?
- Closes #15394.
## Rationale for this change
schema from inner physical plan is returned.
## What changes are included in this PR?
chenkovsky commented on PR #15600:
URL: https://github.com/apache/datafusion/pull/15600#issuecomment-2781362304
> I'm not an expert, but I don't think this issue is due to unbounded recursion
Yes, it's not due to unbounded recursion.
alamb commented on code in PR #67:
URL: https://github.com/apache/datafusion-site/pull/67#discussion_r2030116177
##
content/blog/2025-04-10-fastest-tpch-generator.md:
##
@@ -0,0 +1,614 @@
+---
+layout: post
+title: tpchgen-rs World’s fastest open source TPC-H data generator, wri
kevinjqliu commented on code in PR #67:
URL: https://github.com/apache/datafusion-site/pull/67#discussion_r2030181076
##
content/blog/2025-04-10-fastest-tpch-generator.md:
##
@@ -0,0 +1,613 @@
+---
+layout: post
+title: tpchgen-rs World’s fastest open source TPC-H data generator
andygrove opened a new pull request, #1614:
URL: https://github.com/apache/datafusion-comet/pull/1614
## Which issue does this PR close?
Closes https://github.com/apache/datafusion-comet/issues/1590
Maybe helps with https://github.com/apache/datafusion-comet/issues/1523
milenkovicm opened a new pull request, #1230:
URL: https://github.com/apache/datafusion-ballista/pull/1230
# Which issue does this PR close?
Closes #1205 .
# Rationale for this change
# What changes are included in this PR?
# Are there any user-facing changes?
berkaysynnada commented on code in PR #15539:
URL: https://github.com/apache/datafusion/pull/15539#discussion_r2030265464
##
datafusion/datasource/src/file_groups.rs:
##
@@ -263,7 +264,21 @@ impl FileGroupPartitioner {
.flatten()
.chunk_by(|(partition_i
timsaucer commented on PR #65:
URL: https://github.com/apache/datafusion-site/pull/65#issuecomment-2781663438
I was just playing around with disabling `codehilite` and just allowing the
`highlight.js` tool to do the code formatting. That does give better python
rendering. It seems like the
rluvaton opened a new pull request, #15608:
URL: https://github.com/apache/datafusion/pull/15608
## Which issue does this PR close?
- Closes #15323.
## Rationale for this change
To be able to sort any number of spill files without getting over the tokio
blocking thr
qstommyshu commented on PR #15578:
URL: https://github.com/apache/datafusion/pull/15578#issuecomment-2781841733
Got it, Thanks @alamb for your thoughts.
I will update `roundtrip_statement_with_dialect()`. I will probably go with
the macro approach because the macro approach is essent
Adez017 commented on PR #66:
URL: https://github.com/apache/datafusion-site/pull/66#issuecomment-2782070224
Any updates?
UBarney commented on issue #15601:
URL: https://github.com/apache/datafusion/issues/15601#issuecomment-2782098214
> `tpch_sf1` And `tpch_sf10` by default already partition the input data, so
AFAIK the plans should not be any different (they don't introduce round-robin
repartition)
Th
Adez017 commented on issue #15611:
URL: https://github.com/apache/datafusion/issues/15611#issuecomment-2782129401
I know that something might have been left out of the merge request, and I
want to correct it.
@alamb
rluvaton commented on PR #15610:
URL: https://github.com/apache/datafusion/pull/15610#issuecomment-2782032011
Also, to have a fully working sort, you need to spill in
https://github.com/apache/datafusion/blob/362fcdfc7b9e00cb6126a0cbc41c9abb2637c563/datafusion/physical-plan/src/sorts/bui
Adez017 opened a new issue, #15611:
URL: https://github.com/apache/datafusion/issues/15611
### Describe the bug
I was reviewing the changes made in the docs for the examples in window
functions and noticed that the examples were not organised.

f
2010YOUY01 commented on PR #15608:
URL: https://github.com/apache/datafusion/pull/15608#issuecomment-2781983752
I think this is a similar problem as
https://github.com/apache/datafusion/issues/14692, will check this out soon
alamb commented on PR #67:
URL: https://github.com/apache/datafusion-site/pull/67#issuecomment-2781547765
Thanks @kevinjqliu
alamb commented on code in PR #67:
URL: https://github.com/apache/datafusion-site/pull/67#discussion_r2030224671
##
content/blog/2025-04-10-fastest-tpch-generator.md:
##
@@ -0,0 +1,613 @@
+---
+layout: post
+title: tpchgen-rs World’s fastest open source TPC-H data generator, wri
codecov-commenter commented on PR #1614:
URL:
https://github.com/apache/datafusion-comet/pull/1614#issuecomment-2781519025
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1614?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
andygrove commented on PR #1614:
URL:
https://github.com/apache/datafusion-comet/pull/1614#issuecomment-2781532455
@Kontinuation @wForget could you review?
alamb commented on PR #15413:
URL: https://github.com/apache/datafusion/pull/15413#issuecomment-2781566886
> sadly I'm working on my undergrad thesis project at this time and do not
have time to investigate this either 😢 , might be back around mid april
Good luck with your project / t
alamb commented on PR #15570:
URL: https://github.com/apache/datafusion/pull/15570#issuecomment-2781567330
TLDR is benchmark results look good to me -- thanks @ctsk !
rluvaton commented on issue #15323:
URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781673294
I created a draft PR with a solution, would appreciate your opinion:
- #15608
rluvaton commented on code in PR #15608:
URL: https://github.com/apache/datafusion/pull/15608#discussion_r2030275003
##
datafusion/physical-plan/src/sorts/multi_level_sort_preserving_merge_stream.rs:
##
@@ -0,0 +1,244 @@
+// Licensed to the Apache Software Foundation (ASF) under
rluvaton commented on code in PR #15608:
URL: https://github.com/apache/datafusion/pull/15608#discussion_r2030275339
##
datafusion/physical-plan/src/sorts/streaming_merge.rs:
##
@@ -133,6 +142,24 @@ impl<'a> StreamingMergeBuilder<'a> {
self
}
+pub fn with_spi
rluvaton commented on code in PR #15608:
URL: https://github.com/apache/datafusion/pull/15608#discussion_r2030274817
##
datafusion/physical-plan/src/sorts/multi_level_sort_preserving_merge_stream.rs:
##
@@ -0,0 +1,244 @@
+// Licensed to the Apache Software Foundation (ASF) under
rluvaton commented on code in PR #15608:
URL: https://github.com/apache/datafusion/pull/15608#discussion_r2030275648
##
datafusion/physical-plan/src/sorts/streaming_merge.rs:
##
@@ -143,8 +170,27 @@ impl<'a> StreamingMergeBuilder<'a> {
fetch,
expression
friendlymatthew opened a new pull request, #15609:
URL: https://github.com/apache/datafusion/pull/15609
## Which issue does this PR close?
- Closes #14638
## Rationale for this change
https://github.com/apache/arrow-rs/pull/7141 enables casting from date to
time zone-aw
rluvaton commented on PR #15610:
URL: https://github.com/apache/datafusion/pull/15610#issuecomment-2782018619
BTW, row_hash uses the sort preserving merge stream as well and has a similar
problem. I think the solution should live outside the sort exec
rluvaton commented on code in PR #15610:
URL: https://github.com/apache/datafusion/pull/15610#discussion_r2030454805
##
datafusion/physical-plan/src/sorts/sort.rs:
##
@@ -535,56 +457,262 @@ impl ExternalSorter {
// reserved again for the next spill.
self.merge_