adriangb commented on PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#issuecomment-2798502900
Hey @jayzhan211 thank you for putting the work into trying to clarify this.
At this point I think it would be best to wait for #15566 or a PR that
replaces it to be merged so
marvelshan commented on issue #10451:
URL: https://github.com/apache/datafusion/issues/10451#issuecomment-2798494312
Before proceeding with implementation, I'd like to confirm my approach is
correct. I'm planning to create a new file named `options.md` dedicated to
documenting the available
kosiew commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2798494451
hi @Dandandan
I am getting failed tests with
```rust
#[test]
fn test_all_one() -> Result<()> {
// Helper function to run tests and repo
getChan commented on issue #15096:
URL: https://github.com/apache/datafusion/issues/15096#issuecomment-2798458529
Update subtasks list (maybe)
- [X] Support for approx_distinct on Utf8View is done by
https://github.com/apache/datafusion/pull/15200
- [ ] approx_percentile_cont should supp
chenkovsky commented on issue #15688:
URL: https://github.com/apache/datafusion/issues/15688#issuecomment-2798435085
take
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To
jayzhan211 commented on PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#issuecomment-2798405570
https://github.com/apache/datafusion/pull/15568#discussion_r2038773841
# Why the change is equivalent to yours in the high-level idea.
> 1. DynamicFilterPhysicalExpr ge
Kontinuation commented on issue #1639:
URL:
https://github.com/apache/datafusion-comet/issues/1639#issuecomment-2798421233
This should have been addressed by
https://github.com/apache/datafusion-comet/pull/1565 and
https://github.com/apache/datafusion-comet/pull/1573.
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040512866
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -159,35 +139,13 @@ impl DynamicFilterPhysicalExpr {
)
})?
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040512371
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -159,35 +139,13 @@ impl DynamicFilterPhysicalExpr {
)
})?
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040510160
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -36,16 +36,8 @@ use super::Column;
/// A dynamic [`PhysicalExpr`] that can be updated by
GitHub user westonpace added a comment to the discussion: Should ExecutionPlan
spawn tasks in `execute` function
This may just be a failure in my ability to read the manual. I now see this in
the
[docs](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#tymet
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040505924
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -335,22 +313,12 @@ mod test {
]));
// Each ParquetExec calls `with_new_c
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040504133
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -335,22 +313,12 @@ mod test {
]));
// Each ParquetExec calls `with_new_c
codecov-commenter commented on PR #1641:
URL:
https://github.com/apache/datafusion-comet/pull/1641#issuecomment-2798374572
##
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1641?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040471645
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -335,22 +313,12 @@ mod test {
]));
// Each ParquetExec calls `with_new_c
adriangb commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040469037
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -36,16 +36,8 @@ use super::Column;
/// A dynamic [`PhysicalExpr`] that can be updated by an
DerGut commented on issue #15675:
URL: https://github.com/apache/datafusion/issues/15675#issuecomment-2798317659
You are right!
With `v46.0.1`, the ExternalSorter estimates `35840` bytes for the first
record batch. Running with `sort_spill_reservation_bytes + record batch size ==
m
adriangb commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040467822
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -159,35 +139,13 @@ impl DynamicFilterPhysicalExpr {
)
})?
adriangb commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040466799
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -335,22 +313,12 @@ mod test {
]));
// Each ParquetExec calls `with_new_chi
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040452282
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -36,16 +36,8 @@ use super::Column;
/// A dynamic [`PhysicalExpr`] that can be updated by
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040451293
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -159,35 +139,13 @@ impl DynamicFilterPhysicalExpr {
)
})?
jayzhan211 commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2040450375
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -335,22 +313,12 @@ mod test {
]));
// Each ParquetExec calls `with_new_c
clflushopt commented on issue #14608:
URL: https://github.com/apache/datafusion/issues/14608#issuecomment-2798250520
@alamb Yes once I address the couple of prioritized issues I have open for
`v1.0.0` the next step will be to work on the integration, I agree with having
table functions but
parthchandra commented on issue #1633:
URL:
https://github.com/apache/datafusion-comet/issues/1633#issuecomment-2798210998
To make it easier for the next person looking at this, the only difference
in the types is that the `value` field in the expected schema is `nullable:
false` while in
timsaucer commented on PR #15581:
URL: https://github.com/apache/datafusion/pull/15581#issuecomment-2798145855
I’ll resolve those clippy warnings next time I’m at my computer
parthchandra commented on issue #1598:
URL:
https://github.com/apache/datafusion-comet/issues/1598#issuecomment-2798109113
> > Wow that's phenomenal! Are you able to share some (vague if necessary)
descriptions of your workload, cluster hardware, storage source, and what sort
of tuning (if
kevinjqliu commented on issue #1097:
URL:
https://github.com/apache/datafusion-python/issues/1097#issuecomment-2798091391
This issue will be a good reference. I'll probably also start a tracking
issue for iceberg integration
timsaucer commented on issue #15072:
URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2798046180
Running CI on it now: https://github.com/apache/datafusion-python/pull/1104
timsaucer commented on code in PR #15581:
URL: https://github.com/apache/datafusion/pull/15581#discussion_r2040329018
##
datafusion/ffi/tests/ffi_udtf.rs:
##
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreemen
aharpervc opened a new pull request, #1811:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1811
Reference:
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/print-transact-sql?view=sql-server-ver16
Making `message` a `Box` instead of an enum of (national) stri
parthchandra commented on issue #1639:
URL:
https://github.com/apache/datafusion-comet/issues/1639#issuecomment-2798047128
The third parameter to `PartitionedFileUtils.splitFiles` is a `Path` which
your call seems to be missing. The full stack trace might show where this is
being called fr
timsaucer opened a new pull request, #1104:
URL: https://github.com/apache/datafusion-python/pull/1104
This PR is just to test upstream datafusion 47 prior to cutting a release
Dandandan commented on PR #15690:
URL: https://github.com/apache/datafusion/pull/15690#issuecomment-2798027465
This is promising, need to fix the test and make sure the limit is respected.
milenkovicm opened a new pull request, #1237:
URL: https://github.com/apache/datafusion-ballista/pull/1237
# Which issue does this PR close?
Closes none.
# Rationale for this change
Reduce log levels for a few log statements; I would argue they do not need to
be printe
milenkovicm commented on PR #1236:
URL:
https://github.com/apache/datafusion-ballista/pull/1236#issuecomment-2797957413
Thanks for the patch @mmooyyii
There is a test for object store access, but unfortunately it does not cover
all cases; we definitely need to improve testing.
jus
timsaucer commented on issue #15072:
URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2797943836
> FYI [@timsaucer](https://github.com/timsaucer) we are getting ready to
release datafusion 47 -- shall we test with datafusion-python before doing so?
I've been using a
aharpervc commented on code in PR #1810:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1810#discussion_r2040246620
##
src/parser/mod.rs:
##
@@ -5265,18 +5271,71 @@ impl<'a> Parser<'a> {
trigger_object,
include_each,
condition
alamb closed issue #15126: Unique identifier for MemoryConsumer
URL: https://github.com/apache/datafusion/issues/15126
alamb commented on PR #15613:
URL: https://github.com/apache/datafusion/pull/15613#issuecomment-2797940153
This is great -- thanks again for the work and contribution @EmilyMatt
alamb merged PR #15613:
URL: https://github.com/apache/datafusion/pull/15613
alamb commented on code in PR #15581:
URL: https://github.com/apache/datafusion/pull/15581#discussion_r2040253122
##
datafusion/ffi/tests/ffi_udtf.rs:
##
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.
Dandandan opened a new pull request, #15690:
URL: https://github.com/apache/datafusion/pull/15690
## Which issue does this PR close?
- Closes #.
## Rationale for this change
Performance improvements for this case.
## What changes are included in
aharpervc opened a new pull request, #1810:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1810
Adjacent to: https://github.com/apache/datafusion-sqlparser-rs/pull/1808
with similar considerations
---
This PR introduces support for parsing `CREATE TRIGGER` for SQL
alamb commented on issue #14608:
URL: https://github.com/apache/datafusion/issues/14608#issuecomment-2797923658
> I just read your blogpost today, and I am really happy to have a faster
generator. The post focussed on generating tpc-h to files, but I see you also
discussed something like th
alamb commented on issue #15072:
URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2797918175
FYI @timsaucer we are getting ready to release datafusion 47 -- shall we
test with datafusion-python before doing so?
alamb commented on issue #15072:
URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2797919224
I also tested the upgrade in delta.rs and it seems to have gone well for me
- https://github.com/delta-io/delta-rs/pull/3378
alamb commented on PR #15648:
URL: https://github.com/apache/datafusion/pull/15648#issuecomment-2797909596
@kosiew -- I wonder if you saw this post from @Dandandan :
https://github.com/apache/datafusion/issues/15631#issuecomment-2796844672
It seems a simpler way to improve perfo
mbutrovich opened a new pull request, #1641:
URL: https://github.com/apache/datafusion-comet/pull/1641
## Which issue does this PR close?
Closes #1640. Partially address #1545 by reducing test failures.
## Rationale for this change
## What changes are incl
aharpervc commented on code in PR #1808:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2040196781
##
tests/sqlparser_mssql.rs:
##
@@ -187,6 +187,145 @@ fn parse_mssql_create_procedure() {
let _ = ms().verified_stmt("CREATE PROCEDURE [foo] AS
iffyio merged PR #1803:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1803
aharpervc commented on code in PR #1808:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2040136168
##
src/parser/mod.rs:
##
@@ -5135,6 +5146,63 @@ impl<'a> Parser<'a> {
}))
}
+/// Parse `CREATE FUNCTION` for [SQL Server]
+//
alamb commented on PR #15561:
URL: https://github.com/apache/datafusion/pull/15561#issuecomment-2797869301
This definitely is an API change -- I hit it in the delta-rs upgrade:
- https://github.com/delta-io/delta-rs/pull/3378
I'll make a note to add it to the upgrade guide
Dandandan commented on issue #15676:
URL: https://github.com/apache/datafusion/issues/15676#issuecomment-2797856807
Sorry to have caused so much discussion.
I'm totally in favor of keeping this open and having the function (without
`order by`) for now matching the expectation of "first
andygrove closed issue #15676: Regression in `last_value` functionality
URL: https://github.com/apache/datafusion/issues/15676
mbutrovich opened a new issue, #1640:
URL: https://github.com/apache/datafusion-comet/issues/1640
### Describe the bug
We have diffs for Spark 3.4.3, 3.5.4, 3.5.5, and 4.0.0-preview1 for running
Spark SQL tests. These were generated with different hash abbreviation lengths,
so genera
aharpervc commented on code in PR #1808:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2038224067
##
src/ast/mod.rs:
##
@@ -4050,6 +4051,16 @@ pub enum Statement {
arguments: Vec,
options: Vec,
},
+/// Return (SQL Server
andygrove commented on issue #15676:
URL: https://github.com/apache/datafusion/issues/15676#issuecomment-2797840426
I'm ok with closing this issue since the behavior of first/last aggregates
without explicit ordering is not generally deterministic. We will figure out an
approach in Comet. T
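An illustrative aside on the point above, not code from the thread: a minimal Rust sketch of why `last` without an explicit ordering can return different values across runs, under the assumption that partial results from several partitions are merged in whatever order they happen to finish. The names `last_value`, `last_value_ordered`, and `arrival_order` are hypothetical and do not correspond to DataFusion or Comet APIs.

```rust
/// Returns the last value seen when per-partition results are merged in
/// `arrival_order`. Each inner `Vec` is one partition's rows; `arrival_order`
/// lists partition indices in the order a (hypothetical) engine finished them.
pub fn last_value(partitions: &[Vec<i32>], arrival_order: &[usize]) -> Option<i32> {
    arrival_order
        .iter()
        // Take each partition's own last row, skipping empty partitions.
        .filter_map(|&p| partitions[p].last().copied())
        // The overall result is whichever partition happened to arrive last.
        .last()
}

/// Same idea, but every row carries an ordering key. The result is the value
/// with the largest key, so it no longer depends on `arrival_order`
/// (assuming the keys are unique).
pub fn last_value_ordered(
    partitions: &[Vec<(u32, i32)>],
    arrival_order: &[usize],
) -> Option<i32> {
    arrival_order
        .iter()
        .flat_map(|&p| partitions[p].iter())
        .max_by_key(|(key, _)| *key)
        .map(|&(_, value)| value)
}
```

With partitions `[[1, 2], [3, 4]]`, merging in order `[0, 1]` yields `Some(4)` while `[1, 0]` yields `Some(2)`; the keyed variant returns the same value for both orders as long as the keys are unique.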
andygrove commented on issue #15676:
URL: https://github.com/apache/datafusion/issues/15676#issuecomment-2797833569
> Perhaps we can add order by for the tests?
Spark SQL doesn't seem to support an `ORDER BY` clause in this context.
timsaucer commented on PR #15646:
URL: https://github.com/apache/datafusion/pull/15646#issuecomment-2797745297
I need to take some time to review these comments and think more about it,
likely next week. Also I'm dropping a note for myself that the current
implementation isn't sufficient fo
andygrove commented on code in PR #1593:
URL: https://github.com/apache/datafusion-comet/pull/1593#discussion_r2040094303
##
common/src/main/java/org/apache/comet/parquet/NativeBatchReader.java:
##
@@ -263,111 +272,129 @@ public void init() throws URISyntaxException,
IOExceptio
friendlymatthew commented on issue #15689:
URL: https://github.com/apache/datafusion/issues/15689#issuecomment-2797733418
take
alamb commented on issue #15072:
URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2797691953
The upgrade to arrow 55 is now ready for review too:
- https://github.com/apache/datafusion/pull/15466
friendlymatthew commented on PR #15361:
URL: https://github.com/apache/datafusion/pull/15361#issuecomment-2797704916
_Note: I'm sorry about the super long write ups. I'm not trying to bike
shed._
I was thinking about
https://github.com/apache/datafusion/pull/15361/commits/f906af55df1
romanb commented on code in PR #1803:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1803#discussion_r2040073355
##
src/tokenizer.rs:
##
@@ -895,7 +895,7 @@ impl<'a> Tokenizer<'a> {
};
let mut location = state.location();
-while let Some
alamb opened a new pull request, #68:
URL: https://github.com/apache/datafusion-site/pull/68
Scale Factor 100 is 36GB not 3.6GB
alamb merged PR #15661:
URL: https://github.com/apache/datafusion/pull/15661
alamb commented on code in PR #15503:
URL: https://github.com/apache/datafusion/pull/15503#discussion_r2040065184
##
datafusion/physical-plan/src/statistics.rs:
##
@@ -0,0 +1,196 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license ag
alamb commented on PR #15661:
URL: https://github.com/apache/datafusion/pull/15661#issuecomment-2797689097
> It seems that the PR still has the issue that is mentioned here:
https://github.com/xudong963/arrow-datafusion/pull/5#discussion_r2034641672.
Yes I think you are right -- howev
alamb opened a new issue, #15689:
URL: https://github.com/apache/datafusion/issues/15689
### Describe the bug
As @xudong963 mentions in
-
https://github.com/xudong963/arrow-datafusion/pull/5#discussion_r2034641672.
And also brought up again in
- https://github.com/apac
Omega359 commented on PR #15361:
URL: https://github.com/apache/datafusion/pull/15361#issuecomment-2797640563
Thanks for the additional work on this @friendlymatthew ! I think this
approach is solid - the overhead for the casting is limited to only the cases
where the format string includes
mbutrovich commented on PR #15466:
URL: https://github.com/apache/datafusion/pull/15466#issuecomment-2797625880
> I thought it might be related to improved pre-fetching / fewer IOs due to
This should be easy to confirm with `dtruss`/`dtrace`/`bpftrace`. Let me see
if I find a moment.
alamb commented on PR #15466:
URL: https://github.com/apache/datafusion/pull/15466#issuecomment-2797605664
> Still seeing if this is just noise, but here are flame graphs for Q14 from
my machine if anyone else wants to stare at them:
My theory is that the improvement is due to @rluvat
alamb merged PR #68:
URL: https://github.com/apache/datafusion-site/pull/68
kazuyukitanimura commented on issue #15676:
URL: https://github.com/apache/datafusion/issues/15676#issuecomment-2797612697
> Rewrite our tests for LAST to stop comparing to Spark and implement some
other means to determine that the behavior is correct, and also document that
Comet is not co
alamb commented on PR #68:
URL: https://github.com/apache/datafusion-site/pull/68#issuecomment-2797615019
Thanks @timsaucer
marvelshan commented on issue #10451:
URL: https://github.com/apache/datafusion/issues/10451#issuecomment-2797584011
take
adriangb commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2797543695
Thank you as well!
ozankabak commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2797532385
All right -- we will submit a PR early next week and get it merged ASAP to
enable you to carry on. We will also keep on collaborating with you for
subsequent PRs as this functional
aharpervc commented on code in PR #1808:
URL:
https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2039938786
##
src/ast/ddl.rs:
##
@@ -2157,6 +2157,10 @@ impl fmt::Display for ClusteredBy {
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[c
jayzhan211 commented on PR #15568:
URL: https://github.com/apache/datafusion/pull/15568#issuecomment-2796973499
https://github.com/apache/datafusion/pull/15685
@adriangb, the `snapshot` is different but I think the overall idea should
be the same, while we avoid remapping each time we
leoyvens opened a new pull request, #15687:
URL: https://github.com/apache/datafusion/pull/15687
## Which issue does this PR close?
- Closes #15686.
## What changes are included in this PR?
Adds `parse_hex_as_fixed_size_binary` parser option.
## Are these changes t
aharpervc opened a new pull request, #1809:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1809
Reference:
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/sql-server-utilities-statements-go
Lots of conventional SQL Server tooling supports `GO`, so it seems r
nuno-faria opened a new issue, #15688:
URL: https://github.com/apache/datafusion/issues/15688
### Describe the bug
Unparsing Join operators ignores the projected columns and ends up
projecting everything.
Two conditions cause this to happen:
- the final projected col
andygrove commented on code in PR #1619:
URL: https://github.com/apache/datafusion-comet/pull/1619#discussion_r2039804022
##
native/core/src/parquet/parquet_exec.rs:
##
@@ -61,7 +61,12 @@ pub(crate) fn init_datasource_exec(
data_filters: Option>>,
session_timezone: &st
alamb commented on issue #15072:
URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2797397320
Thanks -- I plan to make a test PR for delta.rs later this afternoon and
will report back
berkaysynnada commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2796969760
> I don't think dynamic vs. static is the right distinction to make here.
I did it since your examples were on dynamic filters. I just wanted to show
dynamic filters case
alamb commented on issue #15620:
URL: https://github.com/apache/datafusion/issues/15620#issuecomment-2797395565
It seems like a reasonable proposal to me
ctsk opened a new issue, #15684:
URL: https://github.com/apache/datafusion/issues/15684
### Is your feature request related to a problem or challenge?
I think it would be useful if the benchmarks took environment variables
into account for the datafusion configuration. This would let dev
leoyvens opened a new issue, #15686:
URL: https://github.com/apache/datafusion/issues/15686
### Is your feature request related to a problem or challenge?
Say you have a column `byte_column` of type `FixedSizeBinary`. Doing `where
byte_column = x'deadbeef'` will fail, because the lite
berkaysynnada commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2796862617
> Could you help clarify when the FilterExec nodes get inserted? Maybe some
examples with DataSourceExecs that do not accept any filters would help.
You can look at the n
Dandandan commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2796844672
Btw as a simple concept, I tested this yesterday and it reduced the execution
time of short-circuiting the all-false / all-true cases by 25% compared to
`true_count` / `false_count`:
adriangb commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2797315179
> @adriangb perhaps we can work on creating a new PR (stacked on this one)
that hooks everything up for dynamic filter pushdown. That way we can have
things ready to go once we get
adriangb commented on code in PR #15685:
URL: https://github.com/apache/datafusion/pull/15685#discussion_r2039832734
##
datafusion/physical-expr/src/expressions/dynamic_filters.rs:
##
@@ -105,47 +97,44 @@ impl DynamicFilterPhysicalExpr {
inner: Arc,
) -> Self {
andygrove commented on code in PR #1619:
URL: https://github.com/apache/datafusion-comet/pull/1619#discussion_r2039766214
##
native/core/src/parquet/parquet_exec.rs:
##
@@ -61,7 +61,12 @@ pub(crate) fn init_datasource_exec(
data_filters: Option>>,
session_timezone: &st
andygrove commented on code in PR #1619:
URL: https://github.com/apache/datafusion-comet/pull/1619#discussion_r2039766851
##
native/core/src/parquet/parquet_exec.rs:
##
@@ -61,7 +61,12 @@ pub(crate) fn init_datasource_exec(
data_filters: Option>>,
session_timezone: &st
andygrove commented on PR #1619:
URL:
https://github.com/apache/datafusion-comet/pull/1619#issuecomment-2797232477
> I addressed the feedback but I no longer see a performance improvement
with `native_datafusion` when disabling `PARQUET_FILTER_PUSHDOWN_ENABLED`, so I
have moved this to dra
andygrove commented on PR #1619:
URL:
https://github.com/apache/datafusion-comet/pull/1619#issuecomment-2797156817
I addressed the feedback but I no longer see a performance improvement with
`native_datafusion` when disabling `PARQUET_FILTER_PUSHDOWN_ENABLED`, so I have
moved this to draft
berkaysynnada commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2796865438
> One more question: it seems like in all cases we end up eagerly cloning
every node: `Join::try_new`. If I understand correctly this may even happen
twice per node as we do th
ch-sc commented on PR #14523:
URL: https://github.com/apache/datafusion/pull/14523#issuecomment-2797103457
Sorry @berkaysynnada, I got side-tracked from this yesterday. The test is
fixed.
andygrove commented on issue #1639:
URL:
https://github.com/apache/datafusion-comet/issues/1639#issuecomment-2797061885
It looks like this error is happening in Spark code and not in Comet code?
It is difficult to know how to help with this since you have a custom Spark
image.