corwinjoy opened a new pull request, #16219:
URL: https://github.com/apache/datafusion/pull/16219
Draft PR to add encryption to datafusion
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to
comphead commented on code in PR #16083:
URL: https://github.com/apache/datafusion/pull/16083#discussion_r2116905907
##
datafusion/physical-plan/src/joins/sort_merge_join.rs:
##
@@ -221,9 +221,11 @@ impl SortMergeJoinExec {
// When output schema contains only the right
Copilot commented on code in PR #16182:
URL: https://github.com/apache/datafusion/pull/16182#discussion_r2117256270
##
benchmarks/compare.py:
##
@@ -156,8 +173,8 @@ def compare(
console.print(table)
# Calculate averages
-avg_baseline_time = total_baseline_time /
2010YOUY01 merged PR #16215:
URL: https://github.com/apache/datafusion/pull/16215
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@dat
2010YOUY01 commented on issue #16054:
URL: https://github.com/apache/datafusion/issues/16054#issuecomment-2924301139
> I've had a closer look, we simply wrap `- {foo}` in parentheses regardless
of whether that's strictly necessary. Following the same principle for
`BinaryExpr` won't look as
2010YOUY01 commented on code in PR #16182:
URL: https://github.com/apache/datafusion/pull/16182#discussion_r2117240352
##
benchmarks/src/util/run.rs:
##
@@ -138,6 +144,13 @@ impl BenchmarkRun {
}
}
+/// Mark current query
+pub fn mark_failed(&mut self) {
l1t1 closed issue #1131: union all +aggregate function in the recursive cte
results an infinite loop
URL: https://github.com/apache/datafusion-python/issues/1131
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL abo
corwinjoy closed pull request #16219: Parquet encryption
URL: https://github.com/apache/datafusion/pull/16219
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-
2010YOUY01 commented on PR #16210:
URL: https://github.com/apache/datafusion/pull/16210#issuecomment-2924177927
This feature is intended purely for performance, right? Not because there
are cases that can only be run with NLJ.
If that's the case, I suggest to do some benchmarks upfront to
ozankabak commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924371687
Seems like folks are on the way to finding a better solution -- let's see
where the discussion goes.
--
This is an automated message from the Apache Git Service.
To respond to th
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922057060
๐ค: Benchmark completed
Details
```
Comparing HEAD and issue_16193
Benchmark cancellation.json
โ
berkaysynnada commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922061847
Thanks @zhuqi-lucas. The problem is clearly visible here, and the solution
makes sense. It doesn't sacrifice performance as seen in the benchmarks, and
not introduce any comple
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922117491
> For example, FileStream could count how many batches it sends back-to-back
without yielding, and after a certain threshold, it yields. WDYT?
Perhaps DataSourceExec could wra
alamb commented on code in PR #16208:
URL: https://github.com/apache/datafusion/pull/16208#discussion_r2115668031
##
datafusion/physical-plan/src/filter.rs:
##
@@ -689,6 +702,46 @@ fn filter_and_project(
})
}
+impl FilterExecStream {
+/// Evaluates the predicate
berkaysynnada commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922064218
> Thanks @zhuqi-lucas. The problem is clearly visible here, and the solution
makes sense. It doesn't sacrifice performance as seen in the benchmarks, and
not introduce any comp
alamb closed pull request #14657: Consolidate feature flags into configuration
guide
URL: https://github.com/apache/datafusion/pull/14657
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
Vedin opened a new pull request, #1867:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1867
`GROUPING SETS` currently accepts only tuples. It can be single elements as
well.
At least, I can see this behaviour in Hive and Snowflake:
https://hive.apache.org/docs/latest/3015
ozankabak opened a new pull request, #16217:
URL: https://github.com/apache/datafusion/pull/16217
## Which issue does this PR close?
- Closes #13748.
## Rationale for this change
DF's theoretical approach to representing orderings, equivalences and
constants is sound, bu
alamb commented on PR #16142:
URL: https://github.com/apache/datafusion/pull/16142#issuecomment-2922372187
Thanks again @zhuqi-lucas and @xudong963
I have also made a PR to add a mention of this change to the upgrade guide:
- https://github.com/apache/datafusion/pull/16216
--
Th
alamb commented on issue #16206:
URL: https://github.com/apache/datafusion/issues/16206#issuecomment-2922383864
I have noticed something similar -- that the pattern of "take / concat"
results in potentially non trival overhead.
I have been recently working on some related code (the sa
berkaysynnada commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922380421
@pepijnve thanks for the example. However, the same example can also consume
the pending return generated by Aggregate, so the guarantee cannot be upheld in
our model.
I
alamb commented on issue #16202:
URL: https://github.com/apache/datafusion/issues/16202#issuecomment-2922385651
Thanks @kosiew -- here is a PR:
- https://github.com/apache/datafusion/pull/16214
--
This is an automated message from the Apache Git Service.
To respond to the message, plea
alamb commented on PR #16217:
URL: https://github.com/apache/datafusion/pull/16217#issuecomment-2922387217
๐ฎ
I will put this on my list to review asap
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL abo
alamb commented on PR #16217:
URL: https://github.com/apache/datafusion/pull/16217#issuecomment-2922389594
@wiedld @xudong963 and @suremarc perhaps you are also interested in this
PR given your past interest in the equivalence code. Also cc @chenkovsky
--
This is an automated message f
ozankabak commented on PR #16217:
URL: https://github.com/apache/datafusion/pull/16217#issuecomment-2922392615
Thanks for the quick response @alamb -- I added some reviewing tips and
tricks to the PR body to help you guys out.
--
This is an automated message from the Apache Git Service.
T
alamb commented on code in PR #15980:
URL: https://github.com/apache/datafusion/pull/15980#discussion_r2115997415
##
datafusion/expr/src/logical_plan/plan.rs:
##
@@ -1494,6 +1494,14 @@ impl LogicalPlan {
let mut param_types: HashMap> =
HashMap::new();
self.a
alamb commented on issue #16149:
URL: https://github.com/apache/datafusion/issues/16149#issuecomment-2922493420
> Do some changes in arrow-rs clickbench benchmark:
Do I understand that the changes you report are due to simply rewriting the
parquet files to have a page index and be unc
alamb commented on PR #16217:
URL: https://github.com/apache/datafusion/pull/16217#issuecomment-2922489274
> Move fast with the release while this gets reviews, which has the
advantage of making DF48 a "checkpoint" before this PR. The downside is that we
will need to do extensive testing bo
ozankabak commented on PR #16217:
URL: https://github.com/apache/datafusion/pull/16217#issuecomment-2922481568
Let's definitely time the merge and DF48 release in a cooperative way. We
have two competing realities at play: (1) The PR attracts a lot of conflicts
and it is not easy to resolve
alamb commented on PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922512365
๐ค `./gh_compare_branch_bench.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~
alamb commented on PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922513729
I kicked off some benchmarks for this PR -- thank you @hendrikmakait ๐
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on t
alamb commented on code in PR #16166:
URL: https://github.com/apache/datafusion/pull/16166#discussion_r2115541241
##
datafusion/common/src/config.rs:
##
@@ -1566,8 +1517,68 @@ impl TableOptions {
/// # Parameters
///
/// * `format`: The file format to use (e.g., C
alamb commented on PR #16166:
URL: https://github.com/apache/datafusion/pull/16166#issuecomment-2921854046
> > However, I didn't see any tests or examples that show a different
behavior after this PR.
> > In my initial quick read through it I would say it has made the APIs
harder to
ozankabak commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922512158
Pipeline-breaking operators creating cancellation issues is a universal
problem -- this is not an `AggregateExec` issue. Therefore, I firmly believe
that the solution has to be uni
alamb commented on code in PR #16216:
URL: https://github.com/apache/datafusion/pull/16216#discussion_r2116071996
##
docs/source/library-user-guide/upgrading.md:
##
@@ -21,6 +21,55 @@
## DataFusion `48.0.0`
+### `VARCHAR` SQL type changed to map to `Utf8View` Arrow type
+
+
codecov-commenter commented on PR #1818:
URL:
https://github.com/apache/datafusion-comet/pull/1818#issuecomment-2922612812
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1818?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
andygrove opened a new issue, #1819:
URL: https://github.com/apache/datafusion-comet/issues/1819
### What is the problem the feature request solves?
Many of Comet's Spark-compatible DataFusion expressions are currently
implemented as `PhysicalExpr`. We would like to update them to imp
andygrove commented on issue #15914:
URL: https://github.com/apache/datafusion/issues/15914#issuecomment-2922618245
@shehabgamin @alamb I created an epic in Comet for implementing our current
expressions as `ScalarUDFImpl` rather than `PhysicalExpr` as a first step
towards contributing them
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922625826
> Pipeline-breaking operators creating cancellation issues is a universal
problem -- this is not an AggregateExec issue.
@ozankabak It's indeed a universal problem, but since
alamb commented on PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922659200
๐ค `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
Blizzara commented on issue #11201:
URL: https://github.com/apache/datafusion/issues/11201#issuecomment-2922682982
Just fwiw, we already use Comet's Cast in our use of DF, through a rewrite
rule:
```
impl FunctionRewrite for OurCastRewrite {
fn name(&self) -> &str {
"
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922689219
Thank you @pepijnve , i may got it now, it means:
# Per-Batch Overhead Comparison
## 1. Counter + compare
```rust
this.batches_processed += 1; // I
Dandandan commented on code in PR #16208:
URL: https://github.com/apache/datafusion/pull/16208#discussion_r2115388263
##
datafusion/physical-plan/src/filter.rs:
##
@@ -689,6 +702,46 @@ fn filter_and_project(
})
}
+impl FilterExecStream {
+/// Evaluates the predic
iffyio merged PR #1864:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1864
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2921910759
@zhuqi-lucas I took the liberty of moving the code again as I gave you a
bad suggestion yesterday.
I also updated the description as i think this PR now handles NoGrouping and
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2921914352
> @zhuqi-lucas I took the liberty of moving the code again as I gave you a
bad suggestion yesterday.
>
> I also updated the description as i think this PR now handles NoGro
zhuqi-lucas commented on code in PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#discussion_r2115587079
##
datafusion/common-runtime/src/common.rs:
##
@@ -98,6 +103,71 @@ impl Drop for SpawnedTask {
}
}
+/// Number of batches to yield before voluntarily r
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2921920586
๐ค `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2921921243
๐ค `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
alamb commented on code in PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#discussion_r2115561327
##
datafusion/common-runtime/src/common.rs:
##
@@ -98,6 +103,71 @@ impl Drop for SpawnedTask {
}
}
+/// Number of batches to yield before voluntarily returni
alamb merged PR #16189:
URL: https://github.com/apache/datafusion/pull/16189
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusi
alamb commented on issue #15314:
URL: https://github.com/apache/datafusion/issues/15314#issuecomment-2921951978
There is a new proposed fix in
- https://github.com/apache/datafusion/pull/16196
--
This is an automated message from the Apache Git Service.
To respond to the message, pleas
alamb commented on PR #16209:
URL: https://github.com/apache/datafusion/pull/16209#issuecomment-2921965832
Given how widely used this code is I suspect this change will be quite
invasive / require changing many tests
--
This is an automated message from the Apache Git Service.
To resp
alamb closed issue #16140: Substrait: Reading a plan with aggregation with two
identical grouping exprs fails to produce correct output columns
URL: https://github.com/apache/datafusion/issues/16140
--
This is an automated message from the Apache Git Service.
To respond to the message, please
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922813081
Drop off for today, i will start the work tomorrow. Thank you @pepijnve for
good discussion, i think it deserve we have the cancellation solution.
--
This is an automated messa
dependabot[bot] opened a new pull request, #16212:
URL: https://github.com/apache/datafusion/pull/16212
Bumps
[testcontainers-modules](https://github.com/testcontainers/testcontainers-rs-modules-community)
from 0.12.0 to 0.12.1.
Release notes
Sourced from https://github.com/testco
alamb commented on PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922659103
๐ค: Benchmark completed
Details
```
group main
reduce-expr-struct-size
-
alamb commented on PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922755179
๐ค: Benchmark completed
Details
```
Comparing HEAD and reduce-expr-struct-size
Benchmark clickbench_extended.json
--
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922801898
> Would this require any changes to `SessionContext`? Cancellation is per
query, so I think you would want to add the `cancelled: AtomicBoolean`, `fn
cancel()` and `fn is_cancell
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922888436
I also agree we could make a query cancellation token work -- the potential
challenges of that approach are:
1. We have to find some way to trigger the cancellation token when the
s
parthchandra commented on issue #1812:
URL:
https://github.com/apache/datafusion-comet/issues/1812#issuecomment-2922892698
Spark's instructions for using sbt have mostly be applicable to Comet for me
https://spark.apache.org/docs/latest/building-spark.html#building-with-sbt
--
This is
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922714438
After my previous comment I started thinking about how our Java code handles
this kind of thing. Typically when we do a `Future#cancel(mayInterrupt: true)`
we have `Thread#isInterru
alamb opened a new pull request, #16216:
URL: https://github.com/apache/datafusion/pull/16216
## Which issue does this PR close?
- Related to https://github.com/apache/datafusion/issues/15096
- Follow on to https://github.com/apache/datafusion/pull/16142
## Rationale for
parthchandra commented on PR #1815:
URL:
https://github.com/apache/datafusion-comet/pull/1815#issuecomment-2922733351
Changing this to draft while I look into the CI failures
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub an
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922773141
Would this require any changes to `SessionContext`? Cancellation is per
query, so I think you would want to add the `cancelled: AtomicBoolean`, `fn
cancel()` and `fn is_cancelled()
codecov-commenter commented on PR #1821:
URL:
https://github.com/apache/datafusion-comet/pull/1821#issuecomment-2922995270
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1821?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922745280
> After my previous comment I started thinking about how our Java code
handles this kind of thing. Typically when we do a `Future#cancel(mayInterrupt:
true)` we have `Thread#isIn
mbutrovich commented on issue #1789:
URL:
https://github.com/apache/datafusion-comet/issues/1789#issuecomment-2922749343
The investigation seems to be diverging from `map_values` and seems more
focused on map value nullability, and in particular struct field nullability
and how that's enco
parthchandra commented on PR #1817:
URL:
https://github.com/apache/datafusion-comet/pull/1817#issuecomment-2922781217
I have another thought on this. Any number of users have developed custom
`AWSCredentialsProvider`s in Java but we would not have corresponding
implementations in Rust (the
comphead merged PR #16213:
URL: https://github.com/apache/datafusion/pull/16213
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@dataf
comphead commented on PR #16215:
URL: https://github.com/apache/datafusion/pull/16215#issuecomment-2922744744
hope downstream users didn't use it
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to t
jonathanc-n commented on PR #16210:
URL: https://github.com/apache/datafusion/pull/16210#issuecomment-2922765656
I believe a follow up pull request can be made to actually have this be used
in the physical plan based on some statistics. @comphead Does this look fine
for now?
--
This is a
parthchandra commented on code in PR #1817:
URL: https://github.com/apache/datafusion-comet/pull/1817#discussion_r2116175913
##
native/Cargo.lock:
##
@@ -375,6 +375,353 @@ version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index";
checksum = "ace50bade
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922901439
@ozankabak
> Pipeline-breaking operators creating cancellation issues is a universal
problem -- this is not an `AggregateExec` issue. Therefore, I firmly believe
that the solut
alamb commented on PR #16215:
URL: https://github.com/apache/datafusion/pull/16215#issuecomment-2922905121
> hope downstream users didn't use it
If they need to write data to IPC files, I think either of the upstream
arrow-rs choices is better:
1. https://docs.rs/arrow-ipc/55.1.0/a
alamb commented on PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922908904
I'll leave this open for a few days to gather any remaining feedback
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub an
kazantsev-maksim commented on PR #1602:
URL:
https://github.com/apache/datafusion-comet/pull/1602#issuecomment-2922907576
Thank you for the feedback! @kazuyukitanimura @parthchandra
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to G
andygrove opened a new issue, #1820:
URL: https://github.com/apache/datafusion-comet/issues/1820
### What is the problem the feature request solves?
The `datafusion-spark` crate now provides a sha2 implementation
(https://github.com/apache/datafusion/pull/16168), so we should be able
andygrove opened a new pull request, #1821:
URL: https://github.com/apache/datafusion-comet/pull/1821
## Which issue does this PR close?
N/A
## Rationale for this change
As we add more expressions from `datafusion-spark` we would need to update
two identi
andygrove commented on code in PR #1821:
URL: https://github.com/apache/datafusion-comet/pull/1821#discussion_r2116291551
##
native/core/src/execution/planner.rs:
##
@@ -146,15 +146,7 @@ pub struct PhysicalPlanner {
impl Default for PhysicalPlanner {
fn default() -> Self
iffyio merged PR #1860:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1860
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr
alamb commented on code in PR #16213:
URL: https://github.com/apache/datafusion/pull/16213#discussion_r2115648865
##
datafusion/common/src/pruning.rs:
##
@@ -258,7 +261,16 @@ impl PruningStatistics for PartitionPruningStatistics {
}
/// Prune a set of containers represented
alamb commented on code in PR #16213:
URL: https://github.com/apache/datafusion/pull/16213#discussion_r2115648305
##
datafusion/common/src/pruning.rs:
##
@@ -137,19 +138,21 @@ pub trait PruningStatistics {
#[derive(Clone)]
pub struct PartitionPruningStatistics {
/// Value
alamb commented on PR #16189:
URL: https://github.com/apache/datafusion/pull/16189#issuecomment-2921966282
Thanks everyone!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
alamb commented on code in PR #16139:
URL: https://github.com/apache/datafusion/pull/16139#discussion_r2115649463
##
datafusion/common/src/pruning.rs:
##
@@ -122,3 +127,984 @@ pub trait PruningStatistics {
values: &HashSet,
) -> Option;
}
+
+/// Prune files based
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922028579
๐ค: Benchmark completed
Details
```
Comparing HEAD and issue_16193
Benchmark clickbench_extended.json
โโ
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922028695
๐ค `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
alamb commented on code in PR #16214:
URL: https://github.com/apache/datafusion/pull/16214#discussion_r2115790804
##
datafusion/core/tests/integration_tests/schema_adapter_integration_tests.rs:
##
@@ -258,3 +259,187 @@ fn test_schema_adapter_preservation() {
// Verify the s
alamb commented on code in PR #16148:
URL: https://github.com/apache/datafusion/pull/16148#discussion_r2115794350
##
datafusion/datasource/src/test_util.rs:
##
@@ -81,6 +83,8 @@ impl FileSource for MockSource {
fn file_type(&self) -> &str {
"mock"
}
+
+imp
alamb merged PR #16139:
URL: https://github.com/apache/datafusion/pull/16139
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusi
alamb opened a new pull request, #16214:
URL: https://github.com/apache/datafusion/pull/16214
## Which issue does this PR close?
- Follow on for https://github.com/apache/datafusion/pull/16148
## Rationale for this change
During the review of https://github.co
Dandandan commented on code in PR #16208:
URL: https://github.com/apache/datafusion/pull/16208#discussion_r2115799584
##
datafusion/physical-plan/src/filter.rs:
##
@@ -689,6 +702,46 @@ fn filter_and_project(
})
}
+impl FilterExecStream {
+/// Evaluates the predic
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-293237
> yes, it sounds better. It will bring less friction and will be more
inclusive
Think about this a bit more, it's an interesting design question whose
responsibility it is to
alamb commented on PR #1864:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1864#issuecomment-2922335147
The code just keeps churning along in this repo. Thank you @iffyio and
@dmzmk
--
This is an automated message from the Apache Git Service.
To respond to the message, pl
alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922322640
I agree with @pepijnve that adding a yield to the producer is not clearly a
better choice -- I think the issue is for streams like AggregateExec that can
always make progress
S
alamb merged PR #16148:
URL: https://github.com/apache/datafusion/pull/16148
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusi
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922532031
> > Even if I couldn't get consensus on that direction, this version could
still be improved. For example, by inserting the YieldStream only when itโs
truly needed, such as in th
alamb commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2116011961
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(Aggregate
zhuqi-lucas commented on issue #16149:
URL: https://github.com/apache/datafusion/issues/16149#issuecomment-2922543426
> > Do some changes in arrow-rs clickbench benchmark:
>
> Do I understand that the changes you report are due to simply rewriting
the parquet files to have a page inde
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2922500062
> I think the issue is for streams like AggregateExec that can always make
progress. So while this PR wraps the input to the AggregateExecStream, I think
logically the yielding i
1 - 100 of 139 matches
Mail list logo