rishvin commented on issue #1820:
URL:
https://github.com/apache/datafusion-comet/issues/1820#issuecomment-2953616585
Hi @andygrove, I tried the following approach, and it looks like there is a
discrepancy between DataFusion's `SparkSha2` output and Spark's.
**This is what I attempted**
kazantsev-maksim opened a new pull request, #1864:
URL: https://github.com/apache/datafusion-comet/pull/1864
## Which issue does this PR close?
Part of https://github.com/apache/datafusion-comet/issues/1819
## Rationale for this change
See https://github.com/apache/datafu
hsrahh commented on issue #16311:
URL: https://github.com/apache/datafusion/issues/16311#issuecomment-2953633219
I think this feature would be really useful. Right now, when I try to use
DESC t1; to see the table schema, it shows an error because DESC is not
supported. Some other SQL system
ozankabak commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2952017708
I merged the latest from main, this is good to go
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL a
irenjj commented on code in PR #16016:
URL: https://github.com/apache/datafusion/pull/16016#discussion_r2133648910
##
datafusion/optimizer/src/rewrite_dependent_join.rs:
##
@@ -0,0 +1,1901 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor
irenjj commented on code in PR #16016:
URL: https://github.com/apache/datafusion/pull/16016#discussion_r2133652030
##
datafusion/optimizer/src/rewrite_dependent_join.rs:
##
@@ -0,0 +1,1901 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor
ozankabak commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2952147349
@zhuqi-lucas, I wanted to make a few final finishing touches as we gave a
chance in case @alamb wants to take a final look. I changed the config
terminology from "frequency" to "pe
xudong963 closed issue #16277: Question about the `map_varchar_to_utf8view`
config
URL: https://github.com/apache/datafusion/issues/16277
alamb commented on PR #16170:
URL: https://github.com/apache/datafusion/pull/16170#issuecomment-2952297536
Thanks @andygrove and @timsaucer -- this plan looks good to me
alamb commented on PR #16249:
URL: https://github.com/apache/datafusion/pull/16249#issuecomment-2952305234
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
alamb commented on PR #16249:
URL: https://github.com/apache/datafusion/pull/16249#issuecomment-2952308503
https://github.com/apache/datafusion/actions/runs/15506836203/job/43662848611?pr=16249
> Caused by:
process didn't exit successfully:
`/home/runner/work/datafusion/datafusi
alamb commented on PR #16217:
URL: https://github.com/apache/datafusion/pull/16217#issuecomment-2952309229
woohoo!
Dandandan commented on PR #16249:
URL: https://github.com/apache/datafusion/pull/16249#issuecomment-2952353060
I added a PR for reverting the changes in arrow-rs
https://github.com/apache/arrow-rs/pull/7623 - probably something subtle with
one of the fast paths that isn't tested in arrow-rs
alamb commented on PR #16249:
URL: https://github.com/apache/datafusion/pull/16249#issuecomment-2952354569
> I added a PR for reverting the changes in arrow-rs
[apache/arrow-rs#7623](https://github.com/apache/arrow-rs/pull/7623) - probably
something subtle with one of the fast paths that is
alamb opened a new pull request, #16317:
URL: https://github.com/apache/datafusion/pull/16317
## Which issue does this PR close?
- Related to https://github.com/apache/datafusion/issues/15797
- Follow on to https://github.com/apache/datafusion/pull/16170
- This is an updated
pepijnve opened a new issue, #16318:
URL: https://github.com/apache/datafusion/issues/16318
### Is your feature request related to a problem or challenge?
When a query pipeline contains one or more pipeline blockers, the query will
spend an extended period of time in the blocking phas
pepijnve commented on issue #16318:
URL: https://github.com/apache/datafusion/issues/16318#issuecomment-2952372827
Creating a PR with a proposed fix
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2952376120
> > I will investigate that if we can remove some internal yield logic, such
as repartition? etc
>
> Good idea, I'm curious to see if you can. `RepartitionExec` is a little
zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2952375579
> @zhuqi-lucas, I wanted to make a few final finishing touches as we gave a
chance in case @alamb wants to take a final look. I changed the config
terminology from "frequency" to
pepijnve opened a new pull request, #16319:
URL: https://github.com/apache/datafusion/pull/16319
## Which issue does this PR close?
- Closes #16318.
- Relates to #16196 and/or #16301
## Rationale for this change
Yielding to the runtime in Tokio involves unwinding the c
alamb opened a new pull request, #16320:
URL: https://github.com/apache/datafusion/pull/16320
- Draft until https://github.com/apache/datafusion/pull/16317 is merged
## Which issue does this PR close?
- Follow on to https://github.com/apache/datafusion/pull/16317
## Ratio
alamb commented on PR #16317:
URL: https://github.com/apache/datafusion/pull/16317#issuecomment-2952383640
I also found we can further unify the metadata handling for Expr::Alias as
well, see
- https://github.com/apache/datafusion/pull/16320
alamb commented on code in PR #16317:
URL: https://github.com/apache/datafusion/pull/16317#discussion_r2133768464
##
datafusion/expr/src/expr_rewriter/mod.rs:
##
@@ -390,11 +390,7 @@ mod test {
} else {
utf8_val
alamb commented on PR #16170:
URL: https://github.com/apache/datafusion/pull/16170#issuecomment-2952385242
I made a PR to main here (no rush on review):
- https://github.com/apache/datafusion/pull/16317
alamb commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2133774454
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(Aggregate
alamb commented on PR #16249:
URL: https://github.com/apache/datafusion/pull/16249#issuecomment-2952394526
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
alamb commented on code in PR #16320:
URL: https://github.com/apache/datafusion/pull/16320#discussion_r2133774218
##
datafusion/expr/src/expr.rs:
##
@@ -3657,7 +3834,7 @@ mod test {
// If this test fails when you change `Expr`, please try
// `Box`ing the fields
alamb commented on PR #16320:
URL: https://github.com/apache/datafusion/pull/16320#issuecomment-2952395132
🤖 `./gh_compare_branch_bench.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~
mbutrovich opened a new pull request, #1862:
URL: https://github.com/apache/datafusion-comet/pull/1862
## Which issue does this PR close?
Closes #458.
## Rationale for this change
## What changes are included in this PR?
## How are these cha
timsaucer commented on PR #1143:
URL:
https://github.com/apache/datafusion-python/pull/1143#issuecomment-2952430291
TODO (tsaucer): update deprecated interface in rust side, update python
bindings, mark deprecated in python as well
alamb commented on PR #16320:
URL: https://github.com/apache/datafusion/pull/16320#issuecomment-2952450760
🤖: Benchmark completed
Details
```
group alamb_field_metadata2
main
-
timsaucer opened a new pull request, #1143:
URL: https://github.com/apache/datafusion-python/pull/1143
# Which issue does this PR close?
Work in progress. Do not merge yet.
# Rationale for this change
# What changes are included in this PR?
# Are there any
pepijnve opened a new issue, #16321:
URL: https://github.com/apache/datafusion/issues/16321
### Describe the bug
When running a query like `select a from annotated_data_infinite2 order by b
desc limit 10`, a `SortPreservingMergeStream` is created that merge sorts the
presorted partit
pepijnve commented on issue #16321:
URL: https://github.com/apache/datafusion/issues/16321#issuecomment-2952504834
Investigating, but help appreciated.
pepijnve commented on issue #16321:
URL: https://github.com/apache/datafusion/issues/16321#issuecomment-2952514406
A closer look shows I might be completely mistaken. Will close if irrelevant.
timsaucer commented on code in PR #16317:
URL: https://github.com/apache/datafusion/pull/16317#discussion_r2133859411
##
datafusion/expr/src/expr.rs:
##
@@ -413,6 +413,162 @@ impl<'a> TreeNodeContainer<'a, Self> for Expr {
}
}
+/// Literal metadata
+///
+/// Stores metad
pepijnve opened a new pull request, #16322:
URL: https://github.com/apache/datafusion/pull/16322
## Which issue does this PR close?
- Closes #16321.
## Rationale for this change
`SortPreservingMergeStream` works in two phases. It first waits for each
input stream to be r
andygrove closed pull request #1853: chore: Upgrade to DataFusion 48.0.0-rc2
URL: https://github.com/apache/datafusion-comet/pull/1853
andygrove closed pull request #1855: [ignore] Debug regression in 48.0.0-rc2
URL: https://github.com/apache/datafusion-comet/pull/1855
andygrove opened a new pull request, #1863:
URL: https://github.com/apache/datafusion-comet/pull/1863
## Which issue does this PR close?
Closes #.
## Rationale for this change
## What changes are included in this PR?
## How are these changes
codecov-commenter commented on PR #1863:
URL:
https://github.com/apache/datafusion-comet/pull/1863#issuecomment-2952624559
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1863?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
alamb opened a new issue, #16323:
URL: https://github.com/apache/datafusion/issues/16323
The ownership of the main datafusion on crates.io
https://crates.io/crates/datafusion Should match the ownership of all
subcrates. I was trying to add @xudong963 as owner to the datafusion crates
so
andygrove commented on issue #16323:
URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2952664869
@alamb I have now sent invites for these crates
alamb closed issue #16323: Request to update crates.io ownership
URL: https://github.com/apache/datafusion/issues/16323
alamb commented on issue #16323:
URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2952666391
Thank you, I got them.
andygrove closed issue #1852: Update or ignore tests in Spark SQL
WholeStageCodegenSuite
URL: https://github.com/apache/datafusion-comet/issues/1852
andygrove merged PR #1859:
URL: https://github.com/apache/datafusion-comet/pull/1859
andygrove commented on PR #1858:
URL:
https://github.com/apache/datafusion-comet/pull/1858#issuecomment-2952704690
This PR is no longer needed now that the diff in
https://github.com/apache/datafusion-comet/pull/1736 is much smaller
andygrove closed pull request #1858: fix: Update broadcast exchange logic to
support reused exchanges
URL: https://github.com/apache/datafusion-comet/pull/1858
andygrove closed issue #1246: Invalid argument error: Invalid arithmetic
operation: Int32 - Int64
URL: https://github.com/apache/datafusion-comet/issues/1246
andygrove commented on issue #1246:
URL:
https://github.com/apache/datafusion-comet/issues/1246#issuecomment-2952705335
Fixed in https://github.com/apache/datafusion-comet/pull/1848
zhuqi-lucas commented on issue #16200:
URL: https://github.com/apache/datafusion/issues/16200#issuecomment-2952713037
I ran some experiments in:
https://github.com/apache/arrow-rs/pull/7624
The results look decent after two runs, but I need to check them again:
```rust
c
jkosh44 commented on issue #16285:
URL: https://github.com/apache/datafusion/issues/16285#issuecomment-2952715421
The Arrow C++ Substrait library also doesn't support Durations, but they
have a comment about using UDTs to support them:
https://github.com/apache/arrow/blob/1d169cc90f65be6ee0
comphead closed pull request #1236: Feat: Support `map`, `map_keys` &
`maps_values`
URL: https://github.com/apache/datafusion-comet/pull/1236
comphead commented on PR #1236:
URL:
https://github.com/apache/datafusion-comet/pull/1236#issuecomment-2952735576
Closing, as those functions are already implemented
jkosh44 commented on issue #16248:
URL: https://github.com/apache/datafusion/issues/16248#issuecomment-2952747673
Most of the cast errors have the same cause: they are trying to cast an
Arrow type that doesn't exist in Substrait.
- https://github.com/apache/datafusion/issues/16275
codecov-commenter commented on PR #1862:
URL:
https://github.com/apache/datafusion-comet/pull/1862#issuecomment-2952754584
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1862?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
adriangb commented on issue #16200:
URL: https://github.com/apache/datafusion/issues/16200#issuecomment-2952754881
Looks like not a big difference to me?
codecov-commenter commented on PR #1861:
URL:
https://github.com/apache/datafusion-comet/pull/1861#issuecomment-2952756479
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1861?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
Chen-Yuan-Lai opened a new pull request, #16324:
URL: https://github.com/apache/datafusion/pull/16324
## Which issue does this PR close?
- Closes #15791 .
## Rationale for this change
## What changes are included in this PR?
## Are these cha
Dandandan commented on PR #16249:
URL: https://github.com/apache/datafusion/pull/16249#issuecomment-2952801442
@alamb benchmark runs ok now
andygrove commented on PR #1862:
URL:
https://github.com/apache/datafusion-comet/pull/1862#issuecomment-2952814148
I ran TPC-H benchmarks and saw shuffles with range partitioning run
natively. I did not see any difference in performance compared to the last set
of benchmarks I ran some tim
duongcongtoai commented on code in PR #16016:
URL: https://github.com/apache/datafusion/pull/16016#discussion_r2134056572
##
datafusion/optimizer/src/rewrite_dependent_join.rs:
##
@@ -0,0 +1,1901 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contr
kevinjqliu commented on issue #14608:
URL: https://github.com/apache/datafusion/issues/14608#issuecomment-2952854828
This is great, thanks @clflushopt
I couldn't find a way to use DataFusion to write multiple Parquet files, but
I think this is a limitation with DataFusion's `COPY` co
theirix opened a new pull request, #16325:
URL: https://github.com/apache/datafusion/pull/16325
## Which issue does this PR close?
- Closes #13563
## Rationale for this change
Explained in #13563 in detail with known syntax examples.
Thanks to [changes to
sqlparser](h
duongcongtoai commented on code in PR #16016:
URL: https://github.com/apache/datafusion/pull/16016#discussion_r2134056643
##
datafusion/optimizer/src/rewrite_dependent_join.rs:
##
@@ -0,0 +1,1901 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contr
a-agmon commented on issue #16303:
URL: https://github.com/apache/datafusion/issues/16303#issuecomment-2952859549
@alamb - I'm less familiar with this area in datafusion but might be able
to give this a shot.
The idea is to add this as a table function right?
I can see that `ListingT
mbutrovich commented on PR #1862:
URL:
https://github.com/apache/datafusion-comet/pull/1862#issuecomment-2952887778
> I ran TPC-H benchmarks and saw shuffles with range partitioning run
natively. I did not see any difference in performance compared to the last set
of benchmarks I ran some
pepijnve commented on PR #16322:
URL: https://github.com/apache/datafusion/pull/16322#issuecomment-2953125280
A sort preserving merge specific test case started failing. I'll dig deeper
to better understand what's going on.
Omega359 opened a new issue, #16326:
URL: https://github.com/apache/datafusion/issues/16326
### Is your feature request related to a problem or challenge?
Prior to DF 48, `Expr::WindowFunction` accepted a `WindowFunction`; it now
requires a `Box`
### Describe the solution you'd
drtconway opened a new issue, #16327:
URL: https://github.com/apache/datafusion/issues/16327
### Describe the bug
I have a UInt64 column containing a 64-bit hash which I want to convert to a
hex string. `to_hex` should work, but gives the error:
```
Error: Custom { kind: Ot
zhuqi-lucas commented on issue #16200:
URL: https://github.com/apache/datafusion/issues/16200#issuecomment-2953460233
> Looks like not a big difference to me?
Some queries show a 30% performance improvement; I will try to reproduce this
in the real DataFusion benchmark.
```
arrow_reader_clickb
kosiew commented on code in PR #16305:
URL: https://github.com/apache/datafusion/pull/16305#discussion_r2134362206
##
datafusion/core/src/datasource/listing/table.rs:
##
@@ -2452,4 +2178,381 @@ mod tests {
Ok(())
}
+
+#[tokio::test]
+async fn infer_preser
kosiew commented on code in PR #16305:
URL: https://github.com/apache/datafusion/pull/16305#discussion_r2134362818
##
datafusion/core/src/datasource/listing/table.rs:
##
@@ -2452,4 +2178,381 @@ mod tests {
Ok(())
}
+
+#[tokio::test]
+async fn infer_preser
kosiew commented on code in PR #16305:
URL: https://github.com/apache/datafusion/pull/16305#discussion_r2134363121
##
datafusion/core/src/datasource/listing/table.rs:
##
@@ -2452,4 +2178,381 @@ mod tests {
Ok(())
}
+
+#[tokio::test]
+async fn infer_preser
kosiew commented on code in PR #16305:
URL: https://github.com/apache/datafusion/pull/16305#discussion_r2134364184
##
datafusion/core/src/datasource/listing/table.rs:
##
@@ -2452,4 +2178,382 @@ mod tests {
Ok(())
}
+
+#[tokio::test]
+async fn infer_preser