qstommyshu commented on PR #15480:
URL: https://github.com/apache/datafusion/pull/15480#issuecomment-2764305700
> Hey again, thanks for working on this 🙏
>
> > can you merge main into this branch please? to remove extra diff
>
> Just to explain, the current PR diff is quite larg
jayzhan211 opened a new issue, #15491:
URL: https://github.com/apache/datafusion/issues/15491
### Is your feature request related to a problem or challenge?
We have error macro like `internal_err` and `exec_err` supported, but not
all of them are supported
Example
* External
jayzhan211 commented on PR #15457:
URL: https://github.com/apache/datafusion/pull/15457#issuecomment-2764347664
> > > count(*) actually doesn't depend on any column on input logically
> >
> >
> > count(*) needs to know the row number of the column, and it doesn't make
sense to count
2010YOUY01 commented on code in PR #15469:
URL: https://github.com/apache/datafusion/pull/15469#discussion_r2020058583
##
datafusion/physical-plan/src/sorts/sort.rs:
##
@@ -416,21 +409,23 @@ impl ExternalSorter {
Some(self.spill_manager.create_in_progress_file("
2010YOUY01 merged PR #15302:
URL: https://github.com/apache/datafusion/pull/15302
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@dat
blaginin commented on PR #15480:
URL: https://github.com/apache/datafusion/pull/15480#issuecomment-2764289869
Hey again, thanks for working on this 🙏
> can you merge main into this branch please? to remove extra diff
Just to explain, the current PR diff is quite large because it
blaginin commented on PR #15480:
URL: https://github.com/apache/datafusion/pull/15480#issuecomment-2764291364
> Hi, @blaginin I'm not sure what exactly you mean by merging it into main. I
see there are no conflicts with the base branch, so it probably means GitHub can fast
forward it?
that'
acking-you commented on code in PR #15462:
URL: https://github.com/apache/datafusion/pull/15462#discussion_r2019944906
##
datafusion/physical-expr/src/expressions/binary.rs:
##
@@ -358,7 +358,50 @@ impl PhysicalExpr for BinaryExpr {
fn evaluate(&self, batch: &RecordBatch) -
qstommyshu commented on PR #15480:
URL: https://github.com/apache/datafusion/pull/15480#issuecomment-2764295647
Got it, thanks for pointing that out. Just cleared up the diff tree.
qstommyshu commented on code in PR #15480:
URL: https://github.com/apache/datafusion/pull/15480#discussion_r2020033789
##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -1374,30 +1464,32 @@ async fn assert_read_filter_count(
Ok(())
}
-async fn assert_e
berkaysynnada commented on PR #15479:
URL: https://github.com/apache/datafusion/pull/15479#issuecomment-2764267974
I think you can generalize this logic by tracking the
`ExecutionPlanProperties::pipeline_behavior()` of operators in the plan.
ctsk closed pull request #15418: Clean up hash_join's ExecutionPlan::execute
URL: https://github.com/apache/datafusion/pull/15418
2010YOUY01 opened a new issue, #15492:
URL: https://github.com/apache/datafusion/issues/15492
### Is your feature request related to a problem or challenge?
When implementing complex logic, it's common to include assertions as sanity
checks to catch potential errors.
The Rust `asse
2010YOUY01 opened a new issue, #15493:
URL: https://github.com/apache/datafusion/issues/15493
### Describe the bug
I have a PR that didn't change the repartition code, but caused one
assertion failure inside `RepartitionExec`'s `execute()` method, during
`custom_datasource.rs` exampl
berkaysynnada commented on code in PR #15459:
URL: https://github.com/apache/datafusion/pull/15459#discussion_r2019995058
##
datafusion/catalog/src/memory/table.rs:
##
@@ -0,0 +1,377 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor licens
the0ninjas commented on issue #15422:
URL: https://github.com/apache/datafusion/issues/15422#issuecomment-2764416585
I'd love to work on this! @alamb Could you share the link to the blog
examples please?
github-actions[bot] closed pull request #1614: Parse Postgres's LOCK TABLE
statement
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1614
2010YOUY01 commented on PR #15423:
URL: https://github.com/apache/datafusion/pull/15423#issuecomment-2764361563
> Isn't it always better partitioning on this selection vectors in case of
hash-rep 🤔 What is the reason of keeping the old strategy ?
I think to support this selection vect
Standing-Man commented on PR #14955:
URL: https://github.com/apache/datafusion/pull/14955#issuecomment-2764329379
Thanks for your valuable contributions! I’ll continue working on fixing this
issue.
Dandandan commented on PR #15423:
URL: https://github.com/apache/datafusion/pull/15423#issuecomment-2763850029
> I'm working on HashAggregate
[goldmedal#3](https://github.com/goldmedal/datafusion/pull/3) based on this PR.
I found we shouldn't use only one config,
`prefer_hash_selection_vec
berkaysynnada commented on PR #15423:
URL: https://github.com/apache/datafusion/pull/15423#issuecomment-2764263842
Isn't it always better partitioning on this selection vectors in case of
hash-rep 🤔 What is the reason of keeping the old strategy ?
qstommyshu commented on PR #15480:
URL: https://github.com/apache/datafusion/pull/15480#issuecomment-2764290781
> Hey again, thanks for working on this 🙏
>
> > can you merge main into this branch please? to remove extra diff
>
> Just to explain, the current PR diff is quite larg
alamb commented on code in PR #15432:
URL: https://github.com/apache/datafusion/pull/15432#discussion_r2019764664
##
datafusion/core/src/datasource/listing/table.rs:
##
@@ -1181,6 +1175,92 @@ impl ListingTable {
}
}
+/// Processes a stream of partitioned files and return
LiaCastaneda commented on issue #15439:
URL: https://github.com/apache/datafusion/issues/15439#issuecomment-2763259160
take
ion-elgreco commented on issue #1064:
URL: https://github.com/apache/datafusion-python/issues/1064#issuecomment-2763203660
Actually a duplicate of
https://github.com/apache/datafusion-python/issues/876
alan910127 commented on code in PR #15482:
URL: https://github.com/apache/datafusion/pull/15482#discussion_r2019692309
##
datafusion/sqllogictest/test_files/push_down_filter.slt:
##
@@ -230,19 +230,19 @@ logical_plan TableScan: t projection=[a],
full_filters=[t.a != Int32(100)]
ctsk commented on PR #15392:
URL: https://github.com/apache/datafusion/pull/15392#issuecomment-2754815322
@alamb This PR should be able to run benchmarks now. I've added overrides to
use the modified version of arrow in the PR and a lockfile to avoid chrono
issues. At least it can run tpch
acking-you commented on code in PR #15462:
URL: https://github.com/apache/datafusion/pull/15462#discussion_r2019945905
##
datafusion/physical-expr/src/expressions/binary.rs:
##
@@ -358,7 +358,50 @@ impl PhysicalExpr for BinaryExpr {
fn evaluate(&self, batch: &RecordBatch) -
romanb opened a new pull request, #1781:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1781
This PR adds support for parsing Databricks'
[TIMESTAMP_NTZ](https://docs.databricks.com/aws/en/sql/language-manual/data-types/timestamp-ntz-type)
data type.
comphead merged PR #15447:
URL: https://github.com/apache/datafusion/pull/15447
comphead closed issue #15403: Improve performance sort TPCH q3 with Utf8Vew (
Sort-preserving merging on a single `Utf8View` )
URL: https://github.com/apache/datafusion/issues/15403
acking-you commented on issue #15461:
URL: https://github.com/apache/datafusion/issues/15461#issuecomment-2763406114
> Trying with this data, I am not getting any error. The dataset is 14 GB.
>
> https://github.com/user-attachments/assets/b777bcc1-198c-425b-9433-8299016d232c
>
Omega359 opened a new pull request, #15489:
URL: https://github.com/apache/datafusion/pull/15489
## Which issue does this PR close?
- Closes #12650
## Rationale for this change
Expose union_by_name/union_by_name_distinct logical plan ops as DataFrame
operations
goldmedal commented on issue #15383:
URL: https://github.com/apache/datafusion/issues/15383#issuecomment-2763293892
@Dandandan
I have a draft https://github.com/goldmedal/datafusion/pull/3 based on
#15423 for `HashAggregate`. Could you check if it's heading in the right
direction?
jatin510 opened a new pull request, #15486:
URL: https://github.com/apache/datafusion/pull/15486
## Which issue does this PR close?
while working on this:
https://github.com/apache/datafusion/issues/14612
i found out that, when we enable `parse_float_as_decimal` as `t
goldmedal commented on PR #15423:
URL: https://github.com/apache/datafusion/pull/15423#issuecomment-2763356410
I'm working on HashAggregate https://github.com/goldmedal/datafusion/pull/3
based on this PR.
I found we shouldn't use only one config,
`prefer_hash_selection_vector_partitionin
LiaCastaneda commented on issue #14799:
URL: https://github.com/apache/datafusion/issues/14799#issuecomment-2763312742
Thanks, this is actually an issue that happens specifically when using the
substrait consumer, I'm closing in favour of #15439
LiaCastaneda closed issue #14799: Duplicate Unqualified Field Name
URL: https://github.com/apache/datafusion/issues/14799
acking-you commented on issue #15461:
URL: https://github.com/apache/datafusion/issues/15461#issuecomment-2763365258
> I am using this as the dataset:
[https://datasets.clickhouse.com/hits_compatible/athena_partitioned/[hits_1.parquet](https://datasets.clickhouse.com/hits_compatible/athena_p
andygrove commented on code in PR #1573:
URL: https://github.com/apache/datafusion-comet/pull/1573#discussion_r2019828133
##
spark/src/main/spark-3.5/org/apache/spark/sql/comet/shims/ShimCometScanExec.scala:
##
@@ -55,15 +55,48 @@ trait ShimCometScanExec {
protected def isNee
psiayn commented on issue #15461:
URL: https://github.com/apache/datafusion/issues/15461#issuecomment-2763338588
Hi, I was trying to reproduce this issue but I get a different error.
I am using this as the dataset:
https://datasets.clickhouse.com/hits_compatible/athena_partitioned/[hi
adriangb commented on issue #15037:
URL: https://github.com/apache/datafusion/issues/15037#issuecomment-2763348634
wrt waiting for filter pushdown to be enabled by default, I think we're just
making our lives harder by coupling them, especially since we can already test
them together under
chenkovsky commented on PR #15457:
URL: https://github.com/apache/datafusion/pull/15457#issuecomment-2763209390
> > count(*) actually doesn't depend on any column on input logically
>
> count(*) needs to know the row number of the column, and it doesn't make
sense to count all on "empty
Vabs-108 commented on issue #508:
URL: https://github.com/apache/datafusion-python/issues/508#issuecomment-2763377457
Is this issue still occurring? Does anyone want me to resolve it?
psiayn commented on issue #15461:
URL: https://github.com/apache/datafusion/issues/15461#issuecomment-2763392522
Trying with this data, I am not getting any error. The dataset is 14 GB.
https://github.com/user-attachments/assets/b777bcc1-198c-425b-9433-8299016d232c
https:/
psiayn commented on issue #15461:
URL: https://github.com/apache/datafusion/issues/15461#issuecomment-2763381508
Thank you, I will try with this
qstommyshu commented on PR #15484:
URL: https://github.com/apache/datafusion/pull/15484#issuecomment-2763402403
Hi @alamb , @blaginin , @xudong963 ,
The code changes are now done; please review carefully, as the changes are
**LARGE**.
There are several things to note when I
andygrove commented on code in PR #1573:
URL: https://github.com/apache/datafusion-comet/pull/1573#discussion_r2019823038
##
spark/src/main/spark-3.5/org/apache/spark/sql/comet/shims/ShimCometScanExec.scala:
##
@@ -55,15 +55,48 @@ trait ShimCometScanExec {
protected def isNee
codecov-commenter commented on PR #1573:
URL: https://github.com/apache/datafusion-comet/pull/1573#issuecomment-2763618618
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1573?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
zebsme commented on code in PR #15423:
URL: https://github.com/apache/datafusion/pull/15423#discussion_r2019843158
##
datafusion/physical-plan/src/repartition/mod.rs:
##
@@ -316,6 +334,70 @@ impl BatchPartitioner {
Ok((partition, batch))
milenkovicm merged PR #1217:
URL: https://github.com/apache/datafusion-ballista/pull/1217
milenkovicm commented on PR #1217:
URL: https://github.com/apache/datafusion-ballista/pull/1217#issuecomment-2763769459
Thanks @westhide
Dandandan commented on PR #15476:
URL: https://github.com/apache/datafusion/pull/15476#issuecomment-2763855444
Thanks @ctsk
Dandandan merged PR #15476:
URL: https://github.com/apache/datafusion/pull/15476
acking-you commented on PR #15462:
URL: https://github.com/apache/datafusion/pull/15462#issuecomment-2764093707
> Also, could you please add the new Q6 benchmark in a separate PR so I can
more easily run my benchmark scripts before/after your code change?
Okay, I got it. Do you mean tha
timsaucer commented on PR #15487:
URL: https://github.com/apache/datafusion/pull/15487#issuecomment-2764095858
@jayzhan211 Would you mind reviewing, specifically the part in
`datafusion/expr/src/type_coercion/functions.rs` since you were the prior
author?
goldmedal commented on code in PR #15423:
URL: https://github.com/apache/datafusion/pull/15423#discussion_r2019935247
##
datafusion/sqllogictest/test_files/join.slt.part:
##
@@ -1389,6 +1389,112 @@ physical_plan
14)--FilterExec: y@1 = x@0
15)---
Omega359 commented on issue #15394:
URL: https://github.com/apache/datafusion/issues/15394#issuecomment-2764244381
I've been trying to find where to resolve this issue but my understanding of
the core of DF is currently too limited to uncover a solution. I've created a
test though that exhi
prowang01 opened a new pull request, #15490:
URL: https://github.com/apache/datafusion/pull/15490
## Which issue does this PR close?
Closes #14432
Relates to #14429
## Rationale for this change
This PR adds a user-facing diagnostic when a SQL function is called with