Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
pepijnve commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2950129956 One performance aspect I've been looking at is the cost of yielding. There's no magic as far as I can tell. Returning a Pending simply leads to a full unwind of the call stack by vi
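The cost described above can be illustrated with a minimal, std-only sketch. It is not DataFusion's implementation: `YieldingSum`, `NoopWake`, and `run_counting_yields` are hypothetical names invented for this example. It shows that every `Poll::Pending` returns all the way out of `poll` to the executor loop, which then has to call `poll` again and re-enter the work from the top.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical CPU-bound job: sums `next..end`, yielding after every
/// `yield_every` additions so an executor can run (or cancel) other work.
pub struct YieldingSum {
    pub next: u64,
    pub end: u64,
    pub acc: u64,
    pub yield_every: u64,
}

impl Future for YieldingSum {
    type Output = u64;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
        // All fields are `Unpin`, so a plain mutable reference is fine.
        let this = self.get_mut();
        // Treat a budget of 0 as 1 so the future always makes progress.
        let mut budget = this.yield_every.max(1);
        while this.next < this.end {
            if budget == 0 {
                // Ask to be polled again, then return to the executor.
                // This return is the "unwind" whose cost is being discussed.
                cx.waker().wake_by_ref();
                return Poll::Pending;
            }
            this.acc += this.next;
            this.next += 1;
            budget -= 1;
        }
        Poll::Ready(this.acc)
    }
}

/// Waker that does nothing; the loop below re-polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls until ready and counts how often the future yielded.
pub fn run_counting_yields<F: Future>(fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut yields = 0;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return (value, yields),
            Poll::Pending => yields += 1,
        }
    }
}
```

Calling `run_counting_yields(YieldingSum { next: 0, end: 10, acc: 0, yield_every: 3 })` returns the sum together with the number of times control went back to the executor. A smaller `yield_every` makes cancellation more responsive but pays for more round trips through `poll`.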

Re: [I] [substrait] [sqllogictest] Unsupported cast type: Duration [datafusion]

2025-06-06 Thread via GitHub
jkosh44 commented on issue #16285: URL: https://github.com/apache/datafusion/issues/16285#issuecomment-2950146627 The query that fails looks like ```sql create table foo (val int, ts1 timestamp, ts2 timestamp, i interval) ... SELECT val, ts1 - ts2 FROM foo ORDER BY ts2 - ts1; ```

Re: [I] [substrait] [sqllogictest] Unsupported cast type: Float16 [datafusion]

2025-06-06 Thread via GitHub
jatin510 commented on issue #16298: URL: https://github.com/apache/datafusion/issues/16298#issuecomment-2950117043 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [I] [substrait] [sqllogictest] Unsupported cast type: Duration [datafusion]

2025-06-06 Thread via GitHub
jkosh44 commented on issue #16285: URL: https://github.com/apache/datafusion/issues/16285#issuecomment-2950167607 Of course, yet another solution would be to add the Duration type to substrait, but they'd need to be interested in doing that.

Re: [PR] chore: Update documentation and ignore Spark SQL tests for known issue with count distinct on NaN in aggregate [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove commented on PR #1847: URL: https://github.com/apache/datafusion-comet/pull/1847#issuecomment-2950239412 Thanks for the review @parthchandra. I will go ahead and merge this and then re-enable the tests once we upgrade to DataFusion 48

Re: [PR] chore: Update documentation and ignore Spark SQL tests for known issue with count distinct on NaN in aggregate [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove merged PR #1847: URL: https://github.com/apache/datafusion-comet/pull/1847

[PR] fix: Update broadcast exchange logic to support reused exchanges [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove opened a new pull request, #1858: URL: https://github.com/apache/datafusion-comet/pull/1858 ## Which issue does this PR close? N/A ## Rationale for this change This fix was needed to fix some Spark SQL test failures in https://github.com/apache/

Re: [I] Update or ignore tests in Spark SQL WholeStageCodegenSuite [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove commented on issue #1852: URL: https://github.com/apache/datafusion-comet/issues/1852#issuecomment-2950335615 Comet does not support codegen, so these tests seem irrelevant. @kazuyukitanimura @parthchandra, is there any objection to adding `IgnoreComet` to these tests?

Re: [PR] fix: Remove `COMET_SHUFFLE_FALLBACK_TO_COLUMNAR` hack [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove commented on PR #1736: URL: https://github.com/apache/datafusion-comet/pull/1736#issuecomment-2950371179 Tests now pass.

Re: [PR] fix: [branch-48] Revert "Improve performance of constant aggregate window expression" [datafusion]

2025-06-06 Thread via GitHub
alamb merged PR #16307: URL: https://github.com/apache/datafusion/pull/16307

Re: [PR] Improve performance of constant aggregate window expression [datafusion]

2025-06-06 Thread via GitHub
alamb commented on PR #16234: URL: https://github.com/apache/datafusion/pull/16234#issuecomment-2950411440 > I suggest we revert this PR for now and then add more tests based on the failing tests in Spark/Comet so that we can have more confidence when the PR is updated. Update: @andyg

Re: [I] Panic in `datafusion_expr::window_state::WindowAggState::update` [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #16308: URL: https://github.com/apache/datafusion/issues/16308#issuecomment-2950422006 I also added this ticket to the list of things we need to do on DataFusion 49 prior to release - https://github.com/apache/datafusion/issues/16235

Re: [PR] feat: add metadata to literal expressions [datafusion]

2025-06-06 Thread via GitHub
andygrove commented on PR #16170: URL: https://github.com/apache/datafusion/pull/16170#issuecomment-2950425160 @alamb @xudong963 I think that we can include this in the next DF 48 rc

Re: [I] Panic in `datafusion_expr::window_state::WindowAggState::update` [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #16308: URL: https://github.com/apache/datafusion/issues/16308#issuecomment-2950399925 We reverted the change in DF 48: - https://github.com/apache/datafusion/pull/16307 We can focus on fixing it for real for DataFusion 49.0.0 FYI @suibianwanwank would yo

Re: [PR] fix: Update broadcast exchange logic to support reused exchanges [datafusion-comet]

2025-06-06 Thread via GitHub
codecov-commenter commented on PR #1858: URL: https://github.com/apache/datafusion-comet/pull/1858#issuecomment-2950470900 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1858?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Spark Test fails `vectorized reader: missing all struct fields` [datafusion-comet]

2025-06-06 Thread via GitHub
parthchandra commented on issue #1843: URL: https://github.com/apache/datafusion-comet/issues/1843#issuecomment-2950544301 Sure. FWIW, I also think it is acceptable to document this as an incompatible result and leave it at that.

[PR] chore: Ignore Spark SQL WholeStageCodegenSuite tests [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove opened a new pull request, #1859: URL: https://github.com/apache/datafusion-comet/pull/1859 ## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/1852 ## Rationale for this change `WholeStageCodegenSuite` con

Re: [PR] MySQL: `[[NOT] ENFORCED]` in CHECK constraint [datafusion-sqlparser-rs]

2025-06-06 Thread via GitHub
iffyio commented on code in PR #1870: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1870#discussion_r2131631473 ## src/parser/mod.rs: ## @@ -8134,7 +8134,19 @@ impl<'a> Parser<'a> { self.expect_token(&Token::LParen)?; let expr = B

Re: [PR] MySQL: Support `index_name` in FK constraints [datafusion-sqlparser-rs]

2025-06-06 Thread via GitHub
iffyio merged PR #1871: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1871

Re: [D] Expose intermediary states in aggregation functions [datafusion]

2025-06-06 Thread via GitHub
GitHub user mgrenonville added a comment to the discussion: Expose intermediary states in aggregation functions Thanks for your reply. I had the big picture, but the documentation made me realize that I "just" have to write two UDAFs: - The first is the `-State`, that uses the original function

Re: [PR] Postgres: Apply `ONLY` keyword per table in TRUNCATE stmt [datafusion-sqlparser-rs]

2025-06-06 Thread via GitHub
iffyio merged PR #1872: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1872

[PR] chore: update DF48 changelog [datafusion]

2025-06-06 Thread via GitHub
xudong963 opened a new pull request, #16269: URL: https://github.com/apache/datafusion/pull/16269 ## Which issue does this PR close? - Include the latest changes in main ## Rationale for this change ## What changes are included in this PR? #

Re: [PR] chore: update DF48 changelog [datafusion]

2025-06-06 Thread via GitHub
xudong963 commented on PR #16269: URL: https://github.com/apache/datafusion/pull/16269#issuecomment-2948325952 @2010YOUY01 Sorry, I just merged the last PR to the wrong branch. Let's do it again

Re: [PR] feat: Hive: support `SORT BY` direction [datafusion-sqlparser-rs]

2025-06-06 Thread via GitHub
iffyio merged PR #1873: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1873

Re: [PR] Fix `CASE` expression spans [datafusion-sqlparser-rs]

2025-06-06 Thread via GitHub
iffyio commented on code in PR #1874: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1874#discussion_r2131646198 ## tests/sqlparser_common.rs: ## @@ -14464,6 +14468,16 @@ fn test_case_statement_span() { ); } +#[test] Review Comment: Could we move the te

Re: [I] Release DataFusion `48.0.0` (June 2025) [datafusion]

2025-06-06 Thread via GitHub
xudong963 commented on issue #15771: URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2948577456 One subtle thing found in https://github.com/datafusion-contrib/datafusion-materialized-views/pull/61#discussion_r2131717796, I've opened a separate issue: https://github.com

Re: [I] Release DataFusion `48.0.0` (June 2025) [datafusion]

2025-06-06 Thread via GitHub
xudong963 commented on issue #15771: URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2948600758 > > [#16267](https://github.com/apache/datafusion/pull/16267) After it's merged, I'll push the 48.0.0-rc2 and start vote > > No issues were found when testing `48.0.0-rc

Re: [I] Question about the `map_varchar_to_utf8view` config [datafusion]

2025-06-06 Thread via GitHub
zhuqi-lucas commented on issue #16277: URL: https://github.com/apache/datafusion/issues/16277#issuecomment-2948627784 Updated: the current code maps Char/Text/String to Utf8, so should we change all of them to make it consistent? ```rust SQLDataType::Char(_) | SQLDataType::Text | SQLDat

[I] Mapping Char/Text/String default to Utf8View [datafusion]

2025-06-06 Thread via GitHub
zhuqi-lucas opened a new issue, #16288: URL: https://github.com/apache/datafusion/issues/16288 ### Is your feature request related to a problem or challenge? Currently, we already support mapping SQL varchar to Utf8View by default; this ticket aims to support the same for Char/Text/String defau
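The behaviour requested in this ticket can be sketched as follows. This is only an illustration under stated assumptions: `SqlStringType`, `ArrowStringType`, and `map_string_type` are hypothetical stand-ins written for this example, not the real `SQLDataType`/`DataType` enums or DataFusion's planner code. The only thing taken from the thread is the intent that Char/Text/String should follow the same `map_varchar_to_utf8view` setting as varchar.

```rust
/// Hypothetical stand-in for the SQL string types named in the thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SqlStringType {
    Varchar,
    Char,
    Text,
    String,
}

/// Hypothetical stand-in for the two Arrow string layouts under discussion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrowStringType {
    Utf8,
    Utf8View,
}

/// Sketch of the proposed rule: every SQL string type follows one flag,
/// instead of only varchar honouring it.
pub fn map_string_type(ty: SqlStringType, map_varchar_to_utf8view: bool) -> ArrowStringType {
    match ty {
        SqlStringType::Varchar
        | SqlStringType::Char
        | SqlStringType::Text
        | SqlStringType::String => {
            if map_varchar_to_utf8view {
                ArrowStringType::Utf8View
            } else {
                ArrowStringType::Utf8
            }
        }
    }
}
```

With the flag enabled, `map_string_type(SqlStringType::Char, true)` yields `ArrowStringType::Utf8View`, the same as for `Varchar`; with it disabled, all four types fall back to `Utf8`.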

Re: [I] Question about the `map_varchar_to_utf8view` config [datafusion]

2025-06-06 Thread via GitHub
xudong963 commented on issue #16277: URL: https://github.com/apache/datafusion/issues/16277#issuecomment-2948636444 I think so, if `map_varchar_to_utf8view` is true, the code should be: ```rust SQLDataType::Char(_) | SQLDataType::Text | SQLDataType::String(_) => { Ok(DataTyp

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
ozankabak commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2948780089 @zhuqi-lucas, I didn't get a chance to take a deep look as I'm traveling, but a cursory look suggests the only open issue is with the filter test. Is that right?

Re: [I] Question about the `map_varchar_to_utf8view` config [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #16277: URL: https://github.com/apache/datafusion/issues/16277#issuecomment-2948779695 Thanks @xudong963 and @zhuqi-lucas -- this makes sense to me

Re: [I] Release DataFusion `48.0.0` (June 2025) [datafusion]

2025-06-06 Thread via GitHub
timsaucer commented on issue #15771: URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2948783313 The one I was hoping to get in but didn’t make it before this vote started was https://github.com/apache/datafusion/pull/16170 This was the last of the breaking metadata

Re: [I] Question about the `map_varchar_to_utf8view` config [datafusion]

2025-06-06 Thread via GitHub
xudong963 commented on issue #16277: URL: https://github.com/apache/datafusion/issues/16277#issuecomment-2948751023 No rush, it’s okay to include it in the next round

Re: [I] Question about the `map_varchar_to_utf8view` config [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #16277: URL: https://github.com/apache/datafusion/issues/16277#issuecomment-2948797108 It seems we are tracking the work in - https://github.com/apache/datafusion/issues/16288 So perhaps we can close this ticket

Re: [I] Adding correct file extension [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #16260: URL: https://github.com/apache/datafusion/issues/16260#issuecomment-2948792339 I agree this would be a great improvement -- thanks!

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
ozankabak commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949112371 @zhuqi-lucas, I took the liberty of parametrizing the tests and hardening them. We now have two failing tests (the filter test, which is probably trivial to fix and more related to

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
zhuqi-lucas commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949124590 > @zhuqi-lucas, I took the liberty of parametrizing the tests and hardening them. We now have two failing (ignored) tests (the filter test, which is probably trivial to fix and m

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
zhuqi-lucas commented on code in PR #16196: URL: https://github.com/apache/datafusion/pull/16196#discussion_r2132112459 ## datafusion/physical-optimizer/src/wrap_leaves_cancellation.rs: ## @@ -76,7 +77,8 @@ impl WrapLeaves { plan: Arc, yield_frequency: usize,

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
ozankabak commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949140262 There are two versions of the join test, one with and one without aggregation (`test_infinite_join_cancel`). So the reason it is passing is something else (maybe the presence o

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
pepijnve commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949146871 @zhuqi-lucas this sort-merge test might also be useful to integrate https://github.com/pepijnve/datafusion/blob/cancel_safety/datafusion/core/tests/execution/yielding.rs#L373 I can

Re: [PR] chore: Upgrade to DataFusion 48.0.0-rc1 [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove closed pull request #1842: chore: Upgrade to DataFusion 48.0.0-rc1 URL: https://github.com/apache/datafusion-comet/pull/1842

[PR] chore: Upgrade to DataFusion 48.0.0-rc2 [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove opened a new pull request, #1853: URL: https://github.com/apache/datafusion-comet/pull/1853 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

[PR] Improve ability to cancel queries quickly [datafusion]

2025-06-06 Thread via GitHub
pepijnve opened a new pull request, #16301: URL: https://github.com/apache/datafusion/pull/16301 ## Which issue does this PR close? - Closes #16193 - Closes #15314 - Closes #14036 This PR is an alternative 'operator intrusive' solution based on the work initially done in

Re: [I] Support Glob Expressions for S3 [datafusion]

2025-06-06 Thread via GitHub
a-agmon commented on issue #7393: URL: https://github.com/apache/datafusion/issues/7393#issuecomment-2949329591 > Here is a proposed way to do this: > > * [Support reading multiple parquet files via `datafusion-cli`  #16303](https://github.com/apache/datafusion/issues/16303) Loo

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
ozankabak commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949371256 > I am wondering about an easy solution (maybe): we just remove the final emissiontype check in the rule and add yield-based leaf nodes; it seems this can solve all our problems: @zhuqi-lu

[I] Automatically detect S3 region if it is not specified [datafusion]

2025-06-06 Thread via GitHub
alamb opened a new issue, #16306: URL: https://github.com/apache/datafusion/issues/16306 ### Is your feature request related to a problem or challenge? - Part of https://github.com/apache/datafusion/issues/13456 I would like to make it easy to use datafusion-cli to query files o

[I] Relax sort fallback constraints [datafusion-comet]

2025-06-06 Thread via GitHub
mbutrovich opened a new issue, #1854: URL: https://github.com/apache/datafusion-comet/issues/1854 ### What is the problem the feature request solves? I came across this TODO in the code base: https://github.com/apache/datafusion-comet/blob/87ef44cb05af5e8e4af7ba7f35b5aca77cf60753/s

Re: [I] Release DataFusion `48.0.0` (June 2025) [datafusion]

2025-06-06 Thread via GitHub
XiangpengHao commented on issue #15771: URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2949445945 No issue found from Parquet Viewer and LiquidCache!

Re: [I] Release DataFusion `48.0.0` (June 2025) [datafusion]

2025-06-06 Thread via GitHub
andygrove commented on issue #15771: URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2949450136 Upgrading Comet to use rc2 causes tests to fail with an `attempt to subtract with overflow` panic. This did not happen with rc1. I have not debugged this yet to find the root c

Re: [I] `datafusion-cli`: Use correct S3 region if it is not specified [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #16306: URL: https://github.com/apache/datafusion/issues/16306#issuecomment-2949467314 - Filed https://github.com/apache/arrow-rs-object-store/issues/402

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
ozankabak commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949486347 @zhuqi-lucas it may make sense at this point to add built-in yielding to the remaining two sources so that you don't have to deal with the diffs

Re: [I] Spark Test fails `vectorized reader: missing all struct fields` [datafusion-comet]

2025-06-06 Thread via GitHub
comphead commented on issue #1843: URL: https://github.com/apache/datafusion-comet/issues/1843#issuecomment-2949482569 @parthchandra that would be my assumption too, but apparently Spark thinks if there is no overlap between columns in actual and expected schema just return null. I'll atta

[PR] fix: [branch-48] Revert "Improve performance of constant aggregate window expression" [datafusion]

2025-06-06 Thread via GitHub
andygrove opened a new pull request, #16307: URL: https://github.com/apache/datafusion/pull/16307 This reverts commit 0c3037404929fc3a3c4fbf6b9b7325d422ce10bd. ## Which issue does this PR close? N/A ## Rationale for this change There is a regression

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
zhuqi-lucas commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949490754 > @zhuqi-lucas it may make sense at this point to add built-in yielding to the remaining two sources so that you don't have to deal with the diffs I agree @ozankabak , good

Re: [I] [DISCUSSION] Make it easier and faster to query remote files (S3, iceberg, etc) [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #13456: URL: https://github.com/apache/datafusion/issues/13456#issuecomment-2949489391 I got nerd sniped this morning and filed a bunch of ideas on how to improve the experience: - [ ] https://github.com/apache/datafusion/issues/16302 - [ ] https://github.com/a

Re: [PR] Track peak_mem_used in ExternalSorter [datafusion]

2025-06-06 Thread via GitHub
ding-young commented on PR #16192: URL: https://github.com/apache/datafusion/pull/16192#issuecomment-2949499201 @2010YOUY01 Thanks for your help! I’m currently working on a different issue (spill file compression option) meanwhile. Feel free to ping me if you'd like me to clarify any of the

[PR] [ignore] Debug regression in 48.0.0-rc2 [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove opened a new pull request, #1855: URL: https://github.com/apache/datafusion-comet/pull/1855 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

[I] Panic in `datafusion_expr::window_state::WindowAggState::update` [datafusion]

2025-06-06 Thread via GitHub
andygrove opened a new issue, #16308: URL: https://github.com/apache/datafusion/issues/16308 ### Describe the bug Upgrading Comet to use rc2 causes tests to fail with an `attempt to subtract with overflow` panic. This did not happen with rc1. I have not debugged this yet to find the r

[I] [substrait] [sqllogictest] Unsupported producing row from empty relation [datafusion]

2025-06-06 Thread via GitHub
gabotechs opened a new issue, #16271: URL: https://github.com/apache/datafusion/issues/16271 ### Describe the bug ~1800 sqllogictest cases are failing in Substrait round-trip mode because of the following error: ``` query failed: DataFusion error: This feature is not implemented

[I] Question about the `map_varchar_to_utf8view` config [datafusion]

2025-06-06 Thread via GitHub
xudong963 opened a new issue, #16277: URL: https://github.com/apache/datafusion/issues/16277 ``` DataFusion CLI v48.0.0 > create table t1(a varchar, b char); 0 row(s) fetched. Elapsed 0.009 seconds. > show columns from t1; +---+--++-

[I] [substrait] [sqllogictest] Unsupported cast type: FixedSizeList [datafusion]

2025-06-06 Thread via GitHub
gabotechs opened a new issue, #16278: URL: https://github.com/apache/datafusion/issues/16278 ### Describe the bug ~25 sqllogictest cases are failing in Substrait round-trip mode because of the following error: ``` query failed: DataFusion error: This feature is not implemented:

Re: [I] Question about the `map_varchar_to_utf8view` config [datafusion]

2025-06-06 Thread via GitHub
xudong963 commented on issue #16277: URL: https://github.com/apache/datafusion/issues/16277#issuecomment-2948444898 cc @zhuqi-lucas @alamb

[I] [substrait] [sqllogictest] Cannot convert to Substrait [datafusion]

2025-06-06 Thread via GitHub
gabotechs opened a new issue, #16281: URL: https://github.com/apache/datafusion/issues/16281 ### Describe the bug ~30 sqllogictest cases are failing in Substrait round-trip mode because of the following error: ``` query failed: DataFusion error: This feature is not implemented:

Re: [I] Release DataFusion `48.0.0` (June 2025) [datafusion]

2025-06-06 Thread via GitHub
shehabgamin commented on issue #15771: URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2948560397 > [#16267](https://github.com/apache/datafusion/pull/16267) After it's merged, I'll push the 48.0.0-rc2 and start vote No issues were found when testing `48.0.0-rc1` o

[I] [substrait] [sqllogictest] Error during planning: No table named 'tmp_table' [datafusion]

2025-06-06 Thread via GitHub
gabotechs opened a new issue, #16279: URL: https://github.com/apache/datafusion/issues/16279 ### Describe the bug ~35 sqllogictest cases are failing in Substrait round-trip mode because of the following error: ``` query failed: DataFusion error: Error during planning: No table n

[I] [substrait] [sqllogictest] Substrait schema timestamp field mismatch [datafusion]

2025-06-06 Thread via GitHub
gabotechs opened a new issue, #16283: URL: https://github.com/apache/datafusion/issues/16283 ### Describe the bug ~10 sqllogictest cases are failing in Substrait round-trip mode because of the following error: ``` query failed: DataFusion error: Substrait error: Field 'ts_nano_e

Re: [PR] chore: update DF48 changelog [datafusion]

2025-06-06 Thread via GitHub
xudong963 commented on PR #16269: URL: https://github.com/apache/datafusion/pull/16269#issuecomment-2948580957 Thank you @2010YOUY01

Re: [PR] chore: update DF48 changelog [datafusion]

2025-06-06 Thread via GitHub
xudong963 merged PR #16269: URL: https://github.com/apache/datafusion/pull/16269

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
pepijnve commented on code in PR #16196: URL: https://github.com/apache/datafusion/pull/16196#discussion_r2131883855 ## datafusion/physical-optimizer/src/wrap_leaves_cancellation.rs: ## @@ -76,7 +77,8 @@ impl WrapLeaves { plan: Arc, yield_frequency: usize,

[I] Support reading multiple parquet files via `datafusion-cli` [datafusion]

2025-06-06 Thread via GitHub
alamb opened a new issue, #16303: URL: https://github.com/apache/datafusion/issues/16303 ### Is your feature request related to a problem or challenge? This is an idea that @robtandy brought up on the DataFusion sync call the other day and I think it would be pretty useful. The

Re: [I] Support Glob Expressions for S3 [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #7393: URL: https://github.com/apache/datafusion/issues/7393#issuecomment-2949320296 Here is a proposed way to do this: - https://github.com/apache/datafusion/issues/16303

Re: [I] Make it easier to query parquet files on remote storage with `datafusion-cli` [datafusion]

2025-06-06 Thread via GitHub
alamb closed issue #16304: Make it easier to query parquet files on remote storage with `datafusion-cli` URL: https://github.com/apache/datafusion/issues/16304

[I] Make it easier to query parquet files on remote storage with `datafusion-cli` [datafusion]

2025-06-06 Thread via GitHub
alamb opened a new issue, #16304: URL: https://github.com/apache/datafusion/issues/16304 ### Is your feature request related to a problem or challenge? - Part of https://github.com/apache/datafusion/issues/13456 There are more and more blogs like [this](https://altinity.com/blo

Re: [PR] chore: Upgrade to DataFusion 48.0.0-rc2 [datafusion-comet]

2025-06-06 Thread via GitHub
codecov-commenter commented on PR #1853: URL: https://github.com/apache/datafusion-comet/pull/1853#issuecomment-2949360752 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1853?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Make it easier to query parquet files on remote storage with `datafusion-cli` [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #16304: URL: https://github.com/apache/datafusion/issues/16304#issuecomment-2949347594 Actually, let's use https://github.com/apache/datafusion/issues/13456 to track this

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
pepijnve commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949117788 > Tests pass even if add in the "pretending" (because the join code seems to yield naturally) The hash join test I have does fail so I dug into this. It's passing for you for

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
zhuqi-lucas commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949226146 > @zhuqi-lucas this sort-merge test might also be useful to integrate https://github.com/pepijnve/datafusion/blob/cancel_safety/datafusion/core/tests/execution/yielding.rs#L373 I

[I] Improved experience when remote object store URL does not end in `/` [datafusion]

2025-06-06 Thread via GitHub
alamb opened a new issue, #16302: URL: https://github.com/apache/datafusion/issues/16302 ### Is your feature request related to a problem or challenge? - part of https://github.com/apache/datafusion/issues/13456 - related to https://github.com/apache/datafusion/issues/16299 I

Re: [PR] Fix `CASE` expression spans [datafusion-sqlparser-rs]

2025-06-06 Thread via GitHub
iffyio merged PR #1874: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1874

[PR] Fix inconsistent schema projection in ListingTable when file order varies by tracking schema source [datafusion]

2025-06-06 Thread via GitHub
kosiew opened a new pull request, #16305: URL: https://github.com/apache/datafusion/pull/16305 ## Which issue does this PR close? - Closes #16270 ## Rationale for this change The current behavior of `ListingTable` in DataFusion can produce inconsistent projected schemas

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
zhuqi-lucas commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949387541 > > I am wondering about an easy solution (maybe): we just remove the final emissiontype check in the rule and add yield-based leaf nodes; it seems this can solve all our problems: > > @

Re: [I] Panic in `datafusion_expr::window_state::WindowAggState::update` [datafusion]

2025-06-06 Thread via GitHub
andygrove commented on issue #16308: URL: https://github.com/apache/datafusion/issues/16308#issuecomment-2949516445 I also see a correctness issue in another test related to windowed aggregates: ``` 2025-06-06T14:15:31.2495550Z [info] - postgreSQL/window_part1.sql *** FAILED *** (

Re: [PR] Improve performance of constant aggregate window expression [datafusion]

2025-06-06 Thread via GitHub
alamb commented on PR #16234: URL: https://github.com/apache/datafusion/pull/16234#issuecomment-2949513136 We found a bug in this change during DataFusion 48 testing - https://github.com/apache/datafusion/issues/16308 if we can't fix it shortly I think we should back out this PR an

Re: [PR] feat: add metadata to literal expressions [datafusion]

2025-06-06 Thread via GitHub
paleolimbot commented on code in PR #16170: URL: https://github.com/apache/datafusion/pull/16170#discussion_r2132354694 ## datafusion/sqllogictest/test_files/array.slt: ## @@ -6061,7 +6061,7 @@ physical_plan 04)--AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))]

Re: [PR] feat: add metadata to literal expressions [datafusion]

2025-06-06 Thread via GitHub
paleolimbot commented on code in PR #16170: URL: https://github.com/apache/datafusion/pull/16170#discussion_r2132357484 ## datafusion/expr/src/expr.rs: ## @@ -274,16 +275,16 @@ use sqlparser::ast::{ /// assert!(rewritten.transformed); /// // to 42 = 5 AND b = 6 /// assert_eq!

[I] Release Comet 0.9.0 [datafusion-comet]

2025-06-06 Thread via GitHub
andygrove opened a new issue, #1856: URL: https://github.com/apache/datafusion-comet/issues/1856 ### What is the problem the feature request solves? I would like to start planning the Comet 0.9.0 release. Let's use this issue to co-ordinate on any issues or PRs to resolve for the rele

Re: [I] Release DataFusion `48.0.0` (June 2025) [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #15771: URL: https://github.com/apache/datafusion/issues/15771#issuecomment-2950586335 Update: - @andygrove reverted the regression in the `release-48`: https://github.com/apache/datafusion/pull/16307 - I am pretty sure the upgrade works for Delta.rs: https:/

[I] Enter tokio runtime during other FFI calls, such as execute [datafusion]

2025-06-06 Thread via GitHub
timsaucer opened a new issue, #16312: URL: https://github.com/apache/datafusion/issues/16312 Please see the discussion in the original post below. Some users wish to spawn tasks during calls like `execute` and others. For pure Rust implementations without FFI this isn't a problem. How

Re: [D] Should ExecutionPlan spawn tasks in `execute` function [datafusion]

2025-06-06 Thread via GitHub
GitHub user timsaucer added a comment to the discussion: Should ExecutionPlan spawn tasks in `execute` function I have converted this discussion into this issue to track correction: https://github.com/apache/datafusion/issues/16312 GitHub link: https://github.com/apache/datafusion/discussion

Re: [PR] chore: Ignore Spark SQL WholeStageCodegenSuite tests [datafusion-comet]

2025-06-06 Thread via GitHub
codecov-commenter commented on PR #1859: URL: https://github.com/apache/datafusion-comet/pull/1859#issuecomment-2950689342 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1859?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

[PR] Minor: Add upgrade guide for `Expr::WindowFunction` [datafusion]

2025-06-06 Thread via GitHub
alamb opened a new pull request, #16313: URL: https://github.com/apache/datafusion/pull/16313 ## Which issue does this PR close? - Part of https://github.com/apache/datafusion/issues/15771 - Related to https://github.com/apache/datafusion/pull/16207 ## Rationale for this chan

Re: [PR] Extend benchmark comparison script with more detailed statistics [datafusion]

2025-06-06 Thread via GitHub
alamb commented on PR #16262: URL: https://github.com/apache/datafusion/pull/16262#issuecomment-2950701725 🚀 thank you @pepijnve @zhuqi-lucas and @2010YOUY01

Re: [I] Support distribution as a MetricValue in ExecutionPlan [datafusion]

2025-06-06 Thread via GitHub
alamb closed issue #16044: Support distribution as a MetricValue in ExecutionPlan URL: https://github.com/apache/datafusion/issues/16044

Re: [PR] feat: Support defining custom MetricValues in PhysicalPlans [datafusion]

2025-06-06 Thread via GitHub
alamb commented on PR #16195: URL: https://github.com/apache/datafusion/pull/16195#issuecomment-2950705278 We have now made the release-48 branch so what is merged into main will be released as part of DataFusion 49.0.0

Re: [PR] feat: Support defining custom MetricValues in PhysicalPlans [datafusion]

2025-06-06 Thread via GitHub
alamb merged PR #16195: URL: https://github.com/apache/datafusion/pull/16195

Re: [I] General framework to decorrelate the subqueries [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #5492: URL: https://github.com/apache/datafusion/issues/5492#issuecomment-2950729008 I am sorry I have not had a chance to review this. I will try and find time over the weekend, but sadly I have several other projects that are higher priority than subqueries to att

Re: [PR] [MAJOR] Equivalence System Overhaul [datafusion]

2025-06-06 Thread via GitHub
alamb commented on PR #16217: URL: https://github.com/apache/datafusion/pull/16217#issuecomment-2950730918 > I removed the conflicts, this is in a good state now. Did we get to the 48 cut-off point yet? We are close I think

Re: [PR] feat: Allow cancelling of grouping operations which are CPU bound [datafusion]

2025-06-06 Thread via GitHub
ozankabak commented on code in PR #16196: URL: https://github.com/apache/datafusion/pull/16196#discussion_r2132866332 ## datafusion/datasource/src/source.rs: ## @@ -179,12 +180,17 @@ pub trait DataSource: Send + Sync + Debug { /// the [`FileSource`] trait. /// /// [`FileSourc

Re: [I] Support Glob Expressions for S3 [datafusion]

2025-06-06 Thread via GitHub
alamb commented on issue #7393: URL: https://github.com/apache/datafusion/issues/7393#issuecomment-2950720194 > Looks good. Thanks LOL now I am just 🎣 for people to actually write the code ;)
