chenkovsky closed pull request #16185: fix: Fix `EquivalenceClass` calculation
for Union queries
URL: https://github.com/apache/datafusion/pull/16185
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to
CookiePieWw commented on issue #15976:
URL: https://github.com/apache/datafusion/issues/15976#issuecomment-2925153931
Thanks for your feedback! I found `ValueStatistics` has already have
[`max_is_exact`](https://docs.rs/parquet/latest/parquet/file/statistics/struct.ValueStatistics.html#metho
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2925152088
> This is exactly what I want to avoid -- there are too many execs and we
shouldn't have a structure where this is repeated everywhere in code.
Isn't that somewhat unavoidable
Dandandan commented on issue #16206:
URL: https://github.com/apache/datafusion/issues/16206#issuecomment-2924659989
So if we're to change `RecordBatch` to `Vec` in the hashjoin, I
think this is what's roughly needed:
* Change `batch` to `Vec` in `JoinLeftData`
* Initialize `JoinLef
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924675924
> @ozankabak
>
> > Pipeline-breaking operators creating cancellation issues is a universal
problem -- this is not an `AggregateExec` issue. Therefore, I firmly believe
tha
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924684413
Just as an FYI, I was reading through the tokio docs yesterday and found
https://docs.rs/tokio/latest/tokio/task/coop/index.html#cooperative-scheduling
which seems very similar to t
ding-young commented on PR #16182:
URL: https://github.com/apache/datafusion/pull/16182#issuecomment-2924686568
Thank you for review. I’ve applied your suggestions, and I’ll run it a few
more times and check the output display once again before requesting a final
review. If there are any ot
Kontinuation commented on code in PR #1817:
URL: https://github.com/apache/datafusion-comet/pull/1817#discussion_r2117529719
##
spark/src/test/scala/org/apache/comet/parquet/ParquetReadFromS3Suite.scala:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
ding-young commented on code in PR #16182:
URL: https://github.com/apache/datafusion/pull/16182#discussion_r2117532342
##
benchmarks/src/clickbench.rs:
##
@@ -128,36 +128,70 @@ impl RunOpt {
let ctx = SessionContext::new_with_config_rt(config,
rt_builder.build_arc()?);
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924691085
> but it's kind of similar to the count-batches-and-yield solution.
Thank you @pepijnve , this is really helpful, so our choice seems like good
enough:
1. For performanc
Kontinuation commented on PR #1817:
URL:
https://github.com/apache/datafusion-comet/pull/1817#issuecomment-2924690797
> I have another thought on this. Any number of users have developed custom
`AWSCredentialsProvider`s in Java but we would not have corresponding
implementations in Rust (t
ding-young commented on code in PR #16182:
URL: https://github.com/apache/datafusion/pull/16182#discussion_r2117536617
##
benchmarks/compare.py:
##
@@ -156,8 +173,8 @@ def compare(
console.print(table)
# Calculate averages
-avg_baseline_time = total_baseline_time
ding-young commented on code in PR #16182:
URL: https://github.com/apache/datafusion/pull/16182#discussion_r2117541964
##
benchmarks/src/sort_tpch.rs:
##
@@ -294,7 +297,7 @@ impl RunOpt {
let mut stream = execute_stream(physical_plan.clone(),
state.task_ctx())?;
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924931929
> I am currently against merging this as is, unless we get stuck in our
search to find a proper way to solve this. Let's keep hacking at finding a good
solution -- we are making
pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924937197
> This is an interesting find, and I think it would be great if we could
somehow utilize this (or something similar to this).
I was just studying the implementation. It would
alamb commented on PR #16215:
URL: https://github.com/apache/datafusion/pull/16215#issuecomment-2925498438
Thanks @comphead and @2010YOUY01
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the sp
kazantsev-maksim opened a new pull request, #1825:
URL: https://github.com/apache/datafusion-comet/pull/1825
## Which issue does this PR close?
Part of https://github.com/apache/datafusion-comet/issues/1819
Closes #.
See https://github.com/apache/datafusion-comet/issues/1
alamb commented on PR #16185:
URL: https://github.com/apache/datafusion/pull/16185#issuecomment-2925525241
- This PR might also be superceded by
https://github.com/apache/datafusion/pull/16217
--
This is an automated message from the Apache Git Service.
To respond to the message, please l
alamb commented on code in PR #15871:
URL: https://github.com/apache/datafusion/pull/15871#discussion_r2118139282
##
datafusion/functions-aggregate/src/count.rs:
##
@@ -179,6 +179,107 @@ impl Count {
}
}
}
+fn get_primitive_type_accumulator(data_type: &DataType) -
ozankabak commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924752646
> https://docs.rs/tokio/latest/tokio/task/coop/fn.consume_budget.html
This is an interesting find, and I think it would be great if we could
somehow utilize this (or somethin
ozankabak commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2924751830
I am currently against merging this as is, unless we get stuck in our search
to find a proper way to solve this. Let's keep hacking at finding a good
solution -- we are making prog
hendrikmakait commented on code in PR #16209:
URL: https://github.com/apache/datafusion/pull/16209#discussion_r2118214706
##
datafusion/expr/src/expr.rs:
##
@@ -2500,7 +2500,7 @@ impl Display for SchemaDisplay<'_> {
}
}
Expr::BinaryExpr
ozankabak commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2925042496
> I just updated the PR also support sort exec, also added end to end
testing.
This is exactly what I want to avoid -- there are too many execs and we
shouldn't have a struc
blaginin commented on PR #15871:
URL: https://github.com/apache/datafusion/pull/15871#issuecomment-2925530843
thank you for the review! i really like your fuzzy testing idea - will push
soon (and respond to the new comments)
--
This is an automated message from the Apache Git Service.
To
comphead commented on code in PR #16216:
URL: https://github.com/apache/datafusion/pull/16216#discussion_r2118165176
##
docs/source/library-user-guide/upgrading.md:
##
@@ -21,6 +21,55 @@
## DataFusion `48.0.0`
+### `VARCHAR` SQL type changed to map to `Utf8View` Arrow type
Dandandan commented on code in PR #16083:
URL: https://github.com/apache/datafusion/pull/16083#discussion_r2118781458
##
datafusion/physical-plan/src/joins/utils.rs:
##
@@ -975,6 +996,12 @@ pub(crate) fn adjust_indices_by_join_type(
// the left_indices will not be u
26 matches
Mail list logo