ding-young commented on code in PR #16268:
URL: https://github.com/apache/datafusion/pull/16268#discussion_r2156147905
##
datafusion/physical-plan/src/sorts/sort.rs:
##
@@ -258,6 +259,8 @@ impl ExternalSorter {
batch_size: usize,
sort_spill_reservation_bytes: u
ding-young commented on code in PR #16268:
URL: https://github.com/apache/datafusion/pull/16268#discussion_r2156147684
##
datafusion/physical-plan/src/joins/sort_merge_join.rs:
##
@@ -1324,6 +1326,8 @@ impl Stream for SortMergeJoinStream {
impl SortMergeJoinStream {
#[allo
Dandandan commented on PR #16433:
URL: https://github.com/apache/datafusion/pull/16433#issuecomment-2986650188
Hm, my earlier benchmarks didn't seem correct. Not sure where the earlier run
came from
--
This is an automated message from the Apache Git Service.
To respond to the message,
Dandandan commented on code in PR #16433:
URL: https://github.com/apache/datafusion/pull/16433#discussion_r2156139567
##
datafusion/physical-plan/src/topk/mod.rs:
##
@@ -319,13 +341,87 @@ impl TopK {
/// (a > 2 OR (a = 2 AND b < 3))
/// ```
fn update_filter(&mut s
Dandandan commented on code in PR #16433:
URL: https://github.com/apache/datafusion/pull/16433#discussion_r2156133514
##
datafusion/physical-plan/src/topk/mod.rs:
##
@@ -319,13 +341,87 @@ impl TopK {
/// (a > 2 OR (a = 2 AND b < 3))
/// ```
fn update_filter(&mut s
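The filter shape quoted in the doc comment above, `(a > 2 OR (a = 2 AND b < 3))`, is a lexicographic comparison against a threshold row. The following is only a minimal sketch of that idea on plain integers; `Better` and `may_enter_topk` are hypothetical names and not part of the DataFusion `TopK` API.

```rust
/// Which side of the threshold wins for one sort column
/// (hypothetical helper type, not a DataFusion type).
#[derive(Clone, Copy, Debug)]
enum Better {
    Greater,
    Less,
}

/// Returns true if `row` sorts strictly ahead of `threshold` when columns
/// are compared left to right. For two columns with (Greater, Less) and a
/// threshold of (2, 3) this is `a > 2 OR (a = 2 AND b < 3)`.
fn may_enter_topk(row: &[i64], threshold: &[i64], better: &[Better]) -> bool {
    for ((r, t), dir) in row.iter().zip(threshold).zip(better) {
        if r == t {
            // Tie on this column: the next column decides.
            continue;
        }
        return match dir {
            Better::Greater => r > t,
            Better::Less => r < t,
        };
    }
    // Equal on every column: the strict comparison fails.
    false
}
```

With a threshold of `[2, 3]` and directions `[Better::Greater, Better::Less]`, the row `[2, 2]` passes and the row `[2, 3]` does not.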
Dandandan commented on code in PR #16433:
URL: https://github.com/apache/datafusion/pull/16433#discussion_r2156106525
##
datafusion/physical-plan/src/topk/mod.rs:
##
@@ -214,41 +238,39 @@ impl TopK {
let mut selected_rows = None;
-if let Some(filter) = self.
suibianwanwank commented on PR #16430:
URL: https://github.com/apache/datafusion/pull/16430#issuecomment-2986591856
> > > @andygrove how can we test this with Comet? Can I just pin to a
datafusion version?
> >
> >
> > Yes, assuming that there are no breaking API changes in DataFus
zhuqi-lucas commented on PR #16398:
URL: https://github.com/apache/datafusion/pull/16398#issuecomment-2986468374
Yeah, the clickbench benchmark is a little slower. It seems to be
reproducible: total time is about 1000ms slower.
> hmm there seems to be some regressions there...
UBarney commented on PR #16443:
URL: https://github.com/apache/datafusion/pull/16443#issuecomment-2986400295
# benchmark
I use this
[script](https://gist.github.com/UBarney/9dcbf304e65f061d3352b34abd0f0e05#file-sql_bench-py)
to run the benchmark
| ID | SQL | join_base Time(s) | join_li
github-actions[bot] commented on PR #13527:
URL: https://github.com/apache/datafusion/pull/13527#issuecomment-2986358525
Thank you for your contribution. Unfortunately, this pull request is stale
because it has been open 60 days with no activity. Please remove the stale
label or comment or
github-actions[bot] closed pull request #15324: feat: implement
GroupsAccumulator for `count(DISTINCT)` aggr
URL: https://github.com/apache/datafusion/pull/15324
github-actions[bot] closed pull request #15392: Draft: Use take-in kernel in
repartitioning
URL: https://github.com/apache/datafusion/pull/15392
jonathanc-n commented on code in PR #16436:
URL: https://github.com/apache/datafusion/pull/16436#discussion_r2155875377
##
datafusion/physical-plan/src/joins/symmetric_hash_join.rs:
##
@@ -810,6 +810,21 @@ where
{
// Store the result in a tuple
let result = match (bui
jonathanc-n commented on PR #16450:
URL: https://github.com/apache/datafusion/pull/16450#issuecomment-2986314949
The upside is that it performs well when both tables are extremely small
(< 50 rows)
codecov-commenter commented on PR #1901:
URL:
https://github.com/apache/datafusion-comet/pull/1901#issuecomment-2986286369
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1901?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
andygrove closed pull request #1904: feat: Add support for native hash join
with BuildRight + LeftAnti
URL: https://github.com/apache/datafusion-comet/pull/1904
comphead commented on PR #1904:
URL:
https://github.com/apache/datafusion-comet/pull/1904#issuecomment-2986186984
> @comphead The root cause is
[apache/datafusion#10583](https://github.com/apache/datafusion/issues/10583)
Got it so it is a HJ issue, I'll try to check DF issue
andygrove merged PR #1910:
URL: https://github.com/apache/datafusion-comet/pull/1910
andygrove commented on PR #16430:
URL: https://github.com/apache/datafusion/pull/16430#issuecomment-2986182319
> > @andygrove how can we test this with Comet? Can I just pin to a
datafusion version?
>
> Yes, assuming that there are no breaking API changes in DataFusion since
48 ... I
andygrove closed pull request #1913: [ignore] test DataFusion PR: Fix constant
window for evaluate stateful
URL: https://github.com/apache/datafusion-comet/pull/1913
andygrove opened a new pull request, #1913:
URL: https://github.com/apache/datafusion-comet/pull/1913
## Which issue does this PR close?
N/A
## Rationale for this change
We would like to see if https://github.com/apache/datafusion/pull/16430
fixes issues
parthchandra commented on code in PR #1910:
URL: https://github.com/apache/datafusion-comet/pull/1910#discussion_r2155783253
##
dev/diffs/3.5.6.diff:
##
@@ -1938,7 +1938,17 @@ index 8e88049f51e..d3c0737d52e 100644
import testImplicits._
// keep() should take effect o
andygrove commented on PR #16430:
URL: https://github.com/apache/datafusion/pull/16430#issuecomment-2986119176
> @andygrove how can we test this with Comet? Can I just pin to a datafusion
version?
Yes, assuming that there are no breaking API changes in DataFusion since 48
... I will
kazuyukitanimura commented on code in PR #1910:
URL: https://github.com/apache/datafusion-comet/pull/1910#discussion_r2155756956
##
dev/diffs/3.5.6.diff:
##
@@ -1938,7 +1938,17 @@ index 8e88049f51e..d3c0737d52e 100644
import testImplicits._
// keep() should take effe
kazuyukitanimura commented on code in PR #1911:
URL: https://github.com/apache/datafusion-comet/pull/1911#discussion_r2155755872
##
spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala:
##
@@ -1946,6 +1946,52 @@ class ParquetReadV1Suite extends ParquetReadSuite w
andygrove commented on code in PR #1910:
URL: https://github.com/apache/datafusion-comet/pull/1910#discussion_r2155751838
##
dev/diffs/3.5.6.diff:
##
@@ -1938,7 +1938,17 @@ index 8e88049f51e..d3c0737d52e 100644
import testImplicits._
// keep() should take effect on S
AdamGS commented on PR #16447:
URL: https://github.com/apache/datafusion/pull/16447#issuecomment-2986090162
Our benchmarks show this change fixes the performance regression we saw -
https://github.com/vortex-data/vortex/pull/3567
andygrove commented on code in PR #1903:
URL: https://github.com/apache/datafusion-comet/pull/1903#discussion_r2155747372
##
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala:
##
@@ -61,6 +61,39 @@ import org.apache.comet.shims.CometExprShim
* An utility object
parthchandra commented on code in PR #1911:
URL: https://github.com/apache/datafusion-comet/pull/1911#discussion_r2155736023
##
spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala:
##
@@ -1946,6 +1946,52 @@ class ParquetReadV1Suite extends ParquetReadSuite with
codecov-commenter commented on PR #1911:
URL:
https://github.com/apache/datafusion-comet/pull/1911#issuecomment-2986043661
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1911?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
codecov-commenter commented on PR #1912:
URL:
https://github.com/apache/datafusion-comet/pull/1912#issuecomment-2986069982
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1912?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
kazuyukitanimura commented on code in PR #1910:
URL: https://github.com/apache/datafusion-comet/pull/1910#discussion_r2155727836
##
dev/diffs/3.5.6.diff:
##
@@ -1938,7 +1938,17 @@ index 8e88049f51e..d3c0737d52e 100644
import testImplicits._
// keep() should take effe
parthchandra commented on code in PR #1910:
URL: https://github.com/apache/datafusion-comet/pull/1910#discussion_r2155726443
##
.github/workflows/spark_sql_test.yml:
##
@@ -114,6 +114,6 @@ jobs:
run: |
cd apache-spark
rm -rf /root/.m2/repository/or
andygrove commented on PR #1912:
URL:
https://github.com/apache/datafusion-comet/pull/1912#issuecomment-2986048912
@kazuyukitanimura This PR will not actually test iceberg-compat until it
includes the fix from https://github.com/apache/datafusion-comet/pull/1910
andygrove commented on PR #1885:
URL:
https://github.com/apache/datafusion-comet/pull/1885#issuecomment-2986043502
The fix for the test failure is in
https://github.com/apache/datafusion-comet/pull/1910
kazuyukitanimura commented on code in PR #1911:
URL: https://github.com/apache/datafusion-comet/pull/1911#discussion_r2155719528
##
spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala:
##
@@ -1946,6 +1946,52 @@ class ParquetReadV1Suite extends ParquetReadSuite w
kazuyukitanimura opened a new pull request, #1912:
URL: https://github.com/apache/datafusion-comet/pull/1912
## Which issue does this PR close?
## Rationale for this change
To trigger Spark 3.4.3 SQL tests for iceberg-compat on PRs
## What changes are included in this PR?
andygrove commented on PR #1910:
URL:
https://github.com/apache/datafusion-comet/pull/1910#issuecomment-2986017103
One test failure, as expected:
```
2025-06-18T22:31:07.6082754Z [info] - SPARK-17091: Convert IN predicate to
Parquet filter push-down *** FAILED *** (297 millisecond
blaginin commented on code in PR #15928:
URL: https://github.com/apache/datafusion/pull/15928#discussion_r2155683469
##
datafusion/functions/src/regex/regexpinstr.rs:
##
@@ -0,0 +1,804 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor lice
adriangb commented on PR #16445:
URL: https://github.com/apache/datafusion/pull/16445#issuecomment-2985988638
> I think it makes sense to only filter on the shared hashmap and not
bothering with the min/max values - creating hashes and doing a single table
lookup is quite fast, so I think w
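The trade-off in this thread, filtering probe rows by a hash-table lookup versus by min/max bounds, can be shown on plain integers. This is only a sketch under assumptions: `BuildSideFilter` and its methods are hypothetical names and do not reflect the code in the PR.

```rust
use std::collections::HashSet;

/// Build-side join keys kept in two forms: an exact hash set and a
/// coarse [min, max] range (hypothetical type, for illustration only).
struct BuildSideFilter {
    keys: HashSet<i64>,
    min: i64,
    max: i64,
}

impl BuildSideFilter {
    /// Returns None when the build side is empty.
    fn new(build_keys: &[i64]) -> Option<Self> {
        let min = *build_keys.iter().min()?;
        let max = *build_keys.iter().max()?;
        Some(Self {
            keys: build_keys.iter().copied().collect(),
            min,
            max,
        })
    }

    /// Cheap range check; lets through any value between min and max.
    fn passes_bounds(&self, v: i64) -> bool {
        self.min <= v && v <= self.max
    }

    /// Exact membership check: one hash plus one table lookup.
    fn passes_lookup(&self, v: i64) -> bool {
        self.keys.contains(&v)
    }
}
```

For build keys `[1, 1000]`, the probe value `500` passes the bounds check but fails the lookup, so the bounds check alone prunes nothing in that range.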
blaginin commented on code in PR #15928:
URL: https://github.com/apache/datafusion/pull/15928#discussion_r2155673833
##
datafusion/functions/src/regex/regexpcount.rs:
##
@@ -29,10 +30,10 @@ use datafusion_expr::{
use datafusion_macros::user_doc;
use itertools::izip;
use regex
parthchandra opened a new pull request, #1911:
URL: https://github.com/apache/datafusion-comet/pull/1911
## Which issue does this PR close?
Adds a new unit test. Also adds a method to generate a complex type parquet
file that can be used to test various complex type cases.
adriangb commented on PR #16371:
URL: https://github.com/apache/datafusion/pull/16371#issuecomment-2985997261
I'll try to review tomorrow.
I took a look the other day and my thought was that while it's complex code
that is a bit hard for me to fully wrap my head around it's well teste
parthchandra commented on code in PR #1892:
URL: https://github.com/apache/datafusion-comet/pull/1892#discussion_r2155694477
##
spark/src/test/scala/org/apache/comet/CometArrayExpressionSuite.scala:
##
@@ -232,6 +232,21 @@ class CometArrayExpressionSuite extends CometTestBase wi
milenkovicm commented on issue #1274:
URL:
https://github.com/apache/datafusion-ballista/issues/1274#issuecomment-2985986246
In short, users should extend Ballista to support the object store they need.
S3 is a bit of a special case.
You can find more details on how to do that in the examples.
parthchandra commented on PR #1901:
URL:
https://github.com/apache/datafusion-comet/pull/1901#issuecomment-2985979040
> Thanks for the contribution, @SKY-ALIN! Could we add a test case with
timestamps as the join key?
The test should have the left side and the right side timestamps b
adriangb commented on PR #16433:
URL: https://github.com/apache/datafusion/pull/16433#issuecomment-2985967718
Seems like a bug in my implementation right? I'd be surprised if the update
checks I added are that heavy compared to other work...
Dandandan commented on PR #16433:
URL: https://github.com/apache/datafusion/pull/16433#issuecomment-2985955904
It seems in some cases it's faster:
```
┏━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━
┃ Query ┃ topk-dynamic-filter ┃ topk-filters ┃
dfinninger opened a new issue, #1274:
URL: https://github.com/apache/datafusion-ballista/issues/1274
Hi, we're trying to make Ballista read parquet files in Google Cloud
Storage. It looks like support for GCS was added in 2023:
https://github.com/apache/datafusion-ballista/pull/805. However
jonathanc-n commented on PR #16434:
URL: https://github.com/apache/datafusion/pull/16434#issuecomment-2985914954
Those benchmarks make sense, just saves memory.
Dandandan commented on PR #16445:
URL: https://github.com/apache/datafusion/pull/16445#issuecomment-2985881381
> > I think doing only the lookup is preferable above also computing /
checking the bounds, I think the latter might create more overhead
>
> My thought was that for some cas
jonathanc-n commented on code in PR #16450:
URL: https://github.com/apache/datafusion/pull/16450#discussion_r2155615159
##
datafusion/core/src/physical_planner.rs:
##
@@ -1009,95 +1012,99 @@ impl DefaultPhysicalPlanner {
let left_df_schema = left.schema();
Dandandan commented on code in PR #16445:
URL: https://github.com/apache/datafusion/pull/16445#discussion_r2155602331
##
datafusion/physical-plan/src/joins/hash_join.rs:
##
@@ -943,10 +978,71 @@ impl ExecutionPlan for HashJoinExec {
try_embed_projection(projection,
codecov-commenter commented on PR #1910:
URL:
https://github.com/apache/datafusion-comet/pull/1910#issuecomment-2985831551
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1910?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
alamb commented on PR #16434:
URL: https://github.com/apache/datafusion/pull/16434#issuecomment-2985835006
🤖: Benchmark completed
Details
```
Comparing HEAD and support-u32-hashmap
Benchmark clickbench_extended.json
andygrove commented on code in PR #1888:
URL: https://github.com/apache/datafusion-comet/pull/1888#discussion_r2155523377
##
spark/src/main/scala/org/apache/comet/rules/RewriteJoin.scala:
##
@@ -65,9 +65,8 @@ object RewriteJoin extends JoinSelectionHelper {
def rewrite(plan:
andygrove opened a new issue, #1909:
URL: https://github.com/apache/datafusion-comet/issues/1909
### What is the problem the feature request solves?
We currently fall back to Spark for hash join with LeftAnti + BuildRight
because of correctness issues. We should file an issue in DataF
mbutrovich merged PR #1907:
URL: https://github.com/apache/datafusion-comet/pull/1907
Dandandan commented on PR #16398:
URL: https://github.com/apache/datafusion/pull/16398#issuecomment-2985797508
hmm there seems to be some regressions there...
andygrove commented on issue #1909:
URL:
https://github.com/apache/datafusion-comet/issues/1909#issuecomment-2985709966
There is already an issue in DataFusion
https://github.com/apache/datafusion/issues/10583
AdamGS commented on PR #16447:
URL: https://github.com/apache/datafusion/pull/16447#issuecomment-2985792094
Got a similar test failure to #16448 (issue filed in #16452). I have to
conclude it's personal at this point; I'll try and find some time to dig into it.
andygrove commented on PR #1904:
URL:
https://github.com/apache/datafusion-comet/pull/1904#issuecomment-2985785274
@comphead The root cause is https://github.com/apache/datafusion/issues/10583
andygrove commented on PR #1904:
URL:
https://github.com/apache/datafusion-comet/pull/1904#issuecomment-2985784145
> > @comphead we seem to have a correctness issue when enabling LeftAnti +
BuildRight:
> > ```
> > [info] +- == Initial Plan ==
> > [info] CometBroadcastHashJoi
andygrove opened a new pull request, #1910:
URL: https://github.com/apache/datafusion-comet/pull/1910
## Which issue does this PR close?
Closes #.
## Rationale for this change
## What changes are included in this PR?
## How are these changes
alamb commented on PR #16371:
URL: https://github.com/apache/datafusion/pull/16371#issuecomment-2985770479
Sorry I have seen this one but haven't found time to review it yet
cc @adriangb and @timsaucer
alamb commented on code in PR #16436:
URL: https://github.com/apache/datafusion/pull/16436#discussion_r211968
##
datafusion/physical-plan/src/joins/symmetric_hash_join.rs:
##
@@ -810,6 +810,21 @@ where
{
// Store the result in a tuple
let result = match (build_sid
AdamGS commented on PR #16447:
URL: https://github.com/apache/datafusion/pull/16447#issuecomment-2985767325
Added a short upgrade note
alamb commented on code in PR #16449:
URL: https://github.com/apache/datafusion/pull/16449#discussion_r2155540756
##
datafusion/physical-plan/Cargo.toml:
##
@@ -36,7 +36,6 @@ workspace = true
[features]
force_hash_collisions = []
-bench = []
Review Comment:
π
###
alamb commented on PR #16434:
URL: https://github.com/apache/datafusion/pull/16434#issuecomment-2985750963
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubun
alamb commented on code in PR #16447:
URL: https://github.com/apache/datafusion/pull/16447#discussion_r2155537108
##
datafusion/sqllogictest/test_files/parquet_statistics.slt:
##
@@ -59,18 +59,18 @@ query TT
EXPLAIN SELECT * FROM test_table WHERE column1 = 1;
physical_pl
AdamGS opened a new issue, #16452:
URL: https://github.com/apache/datafusion/issues/16452
### Describe the bug
Fuzzer failed during an unrelated change -
https://github.com/apache/datafusion/actions/runs/15741542523/job/44367876525?pr=16449.
Not sure how long GitHub retains log
alamb commented on PR #16449:
URL: https://github.com/apache/datafusion/pull/16449#issuecomment-2985730673
> The test failure is a fuzzer failure. Is there an accepted way to open
tickets generated by fuzzing?
I'll restart the test. Maybe you can just create a ticket with a link to the
comphead commented on PR #1904:
URL:
https://github.com/apache/datafusion-comet/pull/1904#issuecomment-2985725340
> @comphead we seem to have a correctness issue when enabling LeftAnti +
BuildRight:
>
> ```
> [info] +- == Initial Plan ==
> [info] CometBroadcastHashJoin [
alamb commented on PR #16430:
URL: https://github.com/apache/datafusion/pull/16430#issuecomment-2985661485
I tried making a reproducer but I could not reproduce the wrong results or
panic reported in @andygrove 's comment
https://github.com/apache/datafusion/issues/16308#issuecomment-294951
alamb commented on PR #16398:
URL: https://github.com/apache/datafusion/pull/16398#issuecomment-2985718371
I took the liberty of merging up from main to resolve a logical conflict
andygrove merged PR #1888:
URL: https://github.com/apache/datafusion-comet/pull/1888
andygrove commented on issue #1909:
URL:
https://github.com/apache/datafusion-comet/issues/1909#issuecomment-2985711731
This issue may be a duplicate of
https://github.com/apache/datafusion-comet/issues/457
andygrove commented on PR #1888:
URL:
https://github.com/apache/datafusion-comet/pull/1888#issuecomment-2985701990
> btw is it still an issue? I think LeftAnti with SMJ has been fixed a while
ago in DF
It looks like there are still issues. I see a correctness issue when trying
to en
alamb merged PR #16342:
URL: https://github.com/apache/datafusion/pull/16342
alamb closed issue #16240: How to write csv file to disk from a empty dataframe?
URL: https://github.com/apache/datafusion/issues/16240
alamb commented on PR #16401:
URL: https://github.com/apache/datafusion/pull/16401#issuecomment-2985697551
Thanks again @epgif and @comphead
alamb merged PR #16401:
URL: https://github.com/apache/datafusion/pull/16401
andygrove commented on PR #1904:
URL:
https://github.com/apache/datafusion-comet/pull/1904#issuecomment-2985695312
@comphead we seem to have a correctness issue when enabling LeftAnti +
BuildRight:
```
[info] +- == Initial Plan ==
[info] CometBroadcastHashJoin [c1#253],
alamb commented on PR #16430:
URL: https://github.com/apache/datafusion/pull/16430#issuecomment-2985662279
@andygrove how can we test this with Comet? Can I just pin to a datafusion
version?
andygrove commented on PR #1904:
URL:
https://github.com/apache/datafusion-comet/pull/1904#issuecomment-2985629164
One test failure:
```
- SPARK-38132: Not IN subquery correctness checks *** FAILED ***
```
alamb commented on issue #15513:
URL: https://github.com/apache/datafusion/issues/15513#issuecomment-2985614903
Now all we need to do is find time to write one
mbutrovich commented on issue #1906:
URL:
https://github.com/apache/datafusion-comet/issues/1906#issuecomment-2985599126
So the big challenge here seems to be mapping Comet to Spark's execution.
Each generated Comet plan samples from its input stream, which itself is only a
single partitio
AdamGS commented on PR #16449:
URL: https://github.com/apache/datafusion/pull/16449#issuecomment-2985590878
The test failure is a fuzzer failure. Is there an accepted way to open
tickets generated by fuzzing?
alamb commented on code in PR #16401:
URL: https://github.com/apache/datafusion/pull/16401#discussion_r2155411540
##
datafusion/catalog/src/information_schema/tests.rs:
##
@@ -0,0 +1,88 @@
+use std::sync::Arc;
Review Comment:
The CI is failing because this file doesn't have
miclegr commented on code in PR #1154:
URL:
https://github.com/apache/datafusion-python/pull/1154#discussion_r2155423158
##
python/datafusion/context.py:
##
@@ -535,7 +535,7 @@ def register_listing_table(
self,
name: str,
path: str | pathlib.Path,
-
AdamGS commented on PR #16447:
URL: https://github.com/apache/datafusion/pull/16447#issuecomment-2985571652
I'm getting a lot of sqllogictest failures. Is there a reason to think there's
something weird going on? I was somewhat open to the idea it's all fine until I
ran into the last test in `
alamb commented on code in PR #16451:
URL: https://github.com/apache/datafusion/pull/16451#discussion_r2155406105
##
.github/workflows/rust.yml:
##
@@ -39,14 +39,6 @@ on:
workflow_dispatch:
jobs:
- # Check license header
Review Comment:
The other copy is here: The oth
alamb opened a new pull request, #16451:
URL: https://github.com/apache/datafusion/pull/16451
## Which issue does this PR close?
## Rationale for this change
- While working on https://github.com/apache/datafusion/pull/16401 I noticed
that the license header check ran t
theirix commented on PR #16325:
URL: https://github.com/apache/datafusion/pull/16325#issuecomment-2985522134
> According to PostgreSQL's reference:
https://wiki.postgresql.org/wiki/TABLESAMPLE_Implementation#SYSTEM_Option I
believe `SYSTEM` option is equivalent to keep the entire `RecordBat
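The reading of `SYSTEM` suggested here, keeping or dropping whole batches instead of individual rows, can be sketched as follows. This is an illustration under assumptions only: batches are plain `Vec<T>` values standing in for record batches, and `sample_system` and `next_uniform` are hypothetical names, not code from the PR.

```rust
/// Block-level sampling: each batch is kept or dropped as a whole.
/// `next_uniform` is assumed to return values in [0, 1); it is passed in
/// so the sketch needs no random-number crate.
fn sample_system<T>(
    batches: Vec<Vec<T>>,
    fraction: f64,
    mut next_uniform: impl FnMut() -> f64,
) -> Vec<Vec<T>> {
    batches
        .into_iter()
        // One draw per batch, never per row.
        .filter(|_| next_uniform() < fraction)
        .collect()
}
```

The number of rows returned depends on which batches survive, so the sampled fraction of rows is only approximate when batch sizes differ.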
andygrove opened a new issue, #1908:
URL: https://github.com/apache/datafusion-comet/issues/1908
### What is the problem the feature request solves?
When using the `auto` mode for choosing the best Parquet scan
implementation, we do not currently take the file source into account. Thi
jonathanc-n commented on PR #16450:
URL: https://github.com/apache/datafusion/pull/16450#issuecomment-2985477515
I will try to run a benchmark on a table with smaller rows and return the
result when finished.
jonathanc-n opened a new pull request, #16450:
URL: https://github.com/apache/datafusion/pull/16450
## Which issue does this PR close?
- Closes #.
## Rationale for this change
We want to support equijoins in `NestedLoopJoin` in the case where one of
the tables in the
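For context, an equijoin evaluated as a nested loop compares every left row against every right row, with no hash table to build. The sketch below is a generic illustration on slices; `nested_loop_equijoin` is a hypothetical name and is unrelated to the `NestedLoopJoin` operator's real interface.

```rust
/// Inner equijoin by brute force: O(left.len() * right.len()) key
/// comparisons and no auxiliary data structure.
fn nested_loop_equijoin<L, R, K>(
    left: &[L],
    right: &[R],
    left_key: impl Fn(&L) -> K,
    right_key: impl Fn(&R) -> K,
) -> Vec<(L, R)>
where
    L: Clone,
    R: Clone,
    K: PartialEq,
{
    let mut out = Vec::new();
    for l in left {
        // Compute the left key once per outer row.
        let lk = left_key(l);
        for r in right {
            if lk == right_key(r) {
                out.push((l.clone(), r.clone()));
            }
        }
    }
    out
}
```

For inputs of a few dozen rows the quadratic comparison count stays small, which is the case this PR targets.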
AdamGS commented on PR #16447:
URL: https://github.com/apache/datafusion/pull/16447#issuecomment-2985425260
Definitely, I'll run [our
benchmarks](https://github.com/vortex-data/vortex/pull/3560) once I get all
tests passing here.
alamb commented on issue #16158:
URL: https://github.com/apache/datafusion/issues/16158#issuecomment-2985421552
> Is there anything to this issue besides changing the default
`datafusion.execution.collect_statistics`, fixing any tests that rely on the
default value being `false` and