Re: [PR] fix: unparse join without projection [datafusion]

2025-04-16 Thread via GitHub
goldmedal commented on PR #15693: URL: https://github.com/apache/datafusion/pull/15693#issuecomment-2809403713 Thanks @chenkovsky -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

Re: [PR] fix: unparse join without projection [datafusion]

2025-04-16 Thread via GitHub
goldmedal merged PR #15693: URL: https://github.com/apache/datafusion/pull/15693

Re: [I] Unparse of Joins is ignoring projections [datafusion]

2025-04-16 Thread via GitHub
goldmedal closed issue #15688: Unparse of Joins is ignoring projections URL: https://github.com/apache/datafusion/issues/15688

[PR] chore(deps): bump indexmap from 2.8.0 to 2.9.0 [datafusion]

2025-04-16 Thread via GitHub
dependabot[bot] opened a new pull request, #15732: URL: https://github.com/apache/datafusion/pull/15732 Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.8.0 to 2.9.0. Changelog Sourced from https://github.com/indexmap-rs/indexmap/blob/main/RELEASES.md indexmap's

Re: [I] When `datafusion.execution.parquet.coerce_int96` is set, timestamp type is still reported as Timestamp(nanoseconds) [datafusion]

2025-04-16 Thread via GitHub
chenkovsky commented on issue #15721: URL: https://github.com/apache/datafusion/issues/15721#issuecomment-2808779352 take

[PR] fix: serialize listing table without partition column [datafusion]

2025-04-16 Thread via GitHub
chenkovsky opened a new pull request, #15737: URL: https://github.com/apache/datafusion/pull/15737 ## Which issue does this PR close? - Closes #15718. ## Rationale for this change partition columns should exclude file schema in listing table ## What changes are inc

Re: [PR] feat: enhance-CLI-query-header-for-cast-expressions-with-literals [datafusion]

2025-04-16 Thread via GitHub
qstommyshu commented on code in PR #15736: URL: https://github.com/apache/datafusion/pull/15736#discussion_r2047845110 ## datafusion/sql/src/parser.rs: ## @@ -469,9 +473,101 @@ impl<'a> DFParser<'a> { } _ => { //

Re: [PR] feat: add `with_group_indices_order_mode` function for `GroupsAccumulator` to help create specialized impl [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #15022: URL: https://github.com/apache/datafusion/pull/15022#issuecomment-2811287282 I will try and review this one soon

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
adriangb commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2811168067 @berkaysynnada I updated the SLT tests. I'll have another review tomorrow but things I'd like to point out now: 1. We should still think about the `retry` parameter. Ideally w

Re: [PR] Add `CREATE FUNCTION` support for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2047767596 ## src/ast/ddl.rs: ## @@ -2272,6 +2277,10 @@ impl fmt::Display for CreateFunction { if let Some(CreateFunctionBody::AsAfterOptions(function_

Re: [PR] Avoid computing unnecessary statstics [datafusion]

2025-04-16 Thread via GitHub
xudong963 merged PR #15729: URL: https://github.com/apache/datafusion/pull/15729

Re: [PR] feat: enhance-CLI-query-header-for-cast-expressions-with-literals [datafusion]

2025-04-16 Thread via GitHub
qstommyshu commented on code in PR #15736: URL: https://github.com/apache/datafusion/pull/15736#discussion_r2047848388 ## datafusion/sql/src/parser.rs: ## Review Comment: This whole code modification will be turned into several functions for code maintainability, I'm leaving

Re: [I] [BUG] Error when adding Date32 and Int64 [datafusion]

2025-04-16 Thread via GitHub
qstommyshu commented on issue #12342: URL: https://github.com/apache/datafusion/issues/12342#issuecomment-2811368679 Ah, looks like datafusion doesn't even support this `to_date('1970-01-01', '-mm-dd');` in v46.0.1 ```SQL > select to_date('1970-01-01', '-mm-dd'); Executi

Re: [PR] feat: enhance-CLI-query-header-for-cast-expressions-with-literals [datafusion]

2025-04-16 Thread via GitHub
qstommyshu commented on code in PR #15736: URL: https://github.com/apache/datafusion/pull/15736#discussion_r2047848388 ## datafusion/sql/src/parser.rs: ## Review Comment: This whole code modification will be turned into several functions for code readability and maintainabil

[PR] Improve push down limit [datafusion]

2025-04-16 Thread via GitHub
xudong963 opened a new pull request, #15744: URL: https://github.com/apache/datafusion/pull/15744 ## Which issue does this PR close? - Closes #. ## Rationale for this change If skip is zero, we can directly remove the limit, the current behavior is to rem

Re: [I] Maybe session memory leak [datafusion-ballista]

2025-04-16 Thread via GitHub
mmooyyii closed issue #1242: Maybe session memory leak URL: https://github.com/apache/datafusion-ballista/issues/1242

Re: [I] Release DataFusion `47.0.0` (April 2025) [datafusion]

2025-04-16 Thread via GitHub
alamb commented on issue #15072: URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2810656117 Note that I will be away starting April 18, and so likely can not complete the vote / release process until April 26. @andygrove would it be possible for you to complete the voti

Re: [PR] Improve push down limit [datafusion]

2025-04-16 Thread via GitHub
xudong963 commented on PR #15744: URL: https://github.com/apache/datafusion/pull/15744#issuecomment-2811894336 The failing tests are related to topk (in the user_defined_plan.rs). Because the PR removes the limit during the first round, so `TopKOptimizerRule` doesn't have a chance to

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
berkaysynnada commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809507083 > @berkaysynnada I merged your change. Still have some failing tests. Also as I said in [pydantic#26 (comment)](https://github.com/pydantic/datafusion/pull/26#discussion_r20467

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
adriangb commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809500404 @berkaysynnada I merged your change. Still have some failing tests. Also as I said in https://github.com/pydantic/datafusion/pull/26#discussion_r2046768875 the `retry` flag needs ei

Re: [PR] Perf: Support automatically concat_batches for sort which will improve performance [datafusion]

2025-04-16 Thread via GitHub
Dandandan commented on PR #15380: URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2809508850 So to change it in your diff (didn't change the documentation). I would like to keep the original `StreamingMergeBuilder` case and the `if self.reservation.size() < self.sort

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
adriangb commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809515533 Great! Btw I gave you write access to our fork so you should be able to push to this branch directly

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
berkaysynnada commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809540717 @adriangb one point doesn't come to me obvious. I've erased a comment saying that "filter predicates reflect the output schema after applying the projection", and some tests ar

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
berkaysynnada commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809547292 > @adriangb one point doesn't come to me obvious. I've erased a comment saying that "filter predicates reflect the output schema after applying the projection", and some tests

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
berkaysynnada commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809563180 I've also another question. When parquet_options.pushdown_filters is true, we can pushdown filters and remove the FilterExec's from the plan. When parquet_options.pushdo

Re: [PR] Coerce and simplify FixedSizeBinary equality to literal binary [datafusion]

2025-04-16 Thread via GitHub
leoyvens commented on PR #15726: URL: https://github.com/apache/datafusion/pull/15726#issuecomment-2809573254 I now realize the `ExprSimplifier` case I added was entirely redundant with the `unwrap_cast_in_comparison` case. I've removed it and just kept the coercion and test cases.

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
adriangb commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809577101 > @adriangb one point doesn't come to me obvious. I've erased a comment saying that "filter predicates reflect the output schema after applying the projection", and some tests are w

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
berkaysynnada commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809586017 > > @adriangb one point doesn't come to me obvious. I've erased a comment saying that "filter predicates reflect the output schema after applying the projection", and some test

Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

2025-04-16 Thread via GitHub
adriangb commented on PR #15566: URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2809592722 > I've also another question. When parquet_options.pushdown_filters is true, we can pushdown filters and remove the FilterExec's from the plan. When parquet_options.pushdown_filters

Re: [PR] chore: correct name of pipelines for native_datafusion ci workflow [datafusion-comet]

2025-04-16 Thread via GitHub
andygrove merged PR #1653: URL: https://github.com/apache/datafusion-comet/pull/1653

Re: [PR] [wip] Add scripts for running benchmarks on EC2 [datafusion-comet]

2025-04-16 Thread via GitHub
codecov-commenter commented on PR #1654: URL: https://github.com/apache/datafusion-comet/pull/1654#issuecomment-2809855816 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1654?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on PR #66: URL: https://github.com/apache/datafusion-site/pull/66#issuecomment-2809867849 > Thank you very much @Adez017 > > I took another pass through this post and did > > 1. some small wordsmithing and formatting tweaks. > 2. Added a section on how windo

Re: [PR] Refactor regexp slt tests [datafusion]

2025-04-16 Thread via GitHub
comphead commented on PR #15709: URL: https://github.com/apache/datafusion/pull/15709#issuecomment-2809885498 Thanks. As long as the test and its dependencies run in a separate context, we are safe.

Re: [PR] feat: track unified memory pool [datafusion-comet]

2025-04-16 Thread via GitHub
andygrove merged PR #1651: URL: https://github.com/apache/datafusion-comet/pull/1651

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on code in PR #66: URL: https://github.com/apache/datafusion-site/pull/66#discussion_r2047186316 ## content/blog/2025-04-04-datafusion-userdefined-window-functions.md: ## @@ -0,0 +1,339 @@ +--- +layout: post +title: User defined Window Functions in DataFusion

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on code in PR #66: URL: https://github.com/apache/datafusion-site/pull/66#discussion_r2047191979 ## content/blog/2025-04-17-user-defined-window-functions.md: ## @@ -0,0 +1,427 @@ +--- +layout: post +title: User defined Window Functions in DataFusion +date: 202

Re: [PR] Perf: Support automatically concat_batches for sort which will improve performance [datafusion]

2025-04-16 Thread via GitHub
zhuqi-lucas commented on PR #15380: URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2809956328 Thank you @Dandandan, I submitted the first version, which is both fast and free of regressions. Benchmark sort_tpch1.json

Re: [I] CLI query result header for cast expressions with literals is confusing [datafusion]

2025-04-16 Thread via GitHub
alamb commented on issue #5221: URL: https://github.com/apache/datafusion/issues/5221#issuecomment-2810217274 @qstommyshu it sounds like a reasonable idea -- it seems like a PR (as you seem to have to created https://github.com/apache/datafusion/pull/15736) is probably the best way. I'll t

Re: [PR] Fix local preview link in README [datafusion-site]

2025-04-16 Thread via GitHub
alamb merged PR #69: URL: https://github.com/apache/datafusion-site/pull/69

Re: [PR] Fix local preview link in README [datafusion-site]

2025-04-16 Thread via GitHub
alamb commented on PR #69: URL: https://github.com/apache/datafusion-site/pull/69#issuecomment-2810236033 Thanks @kevinjqliu and @viirya

Re: [PR] Refactor regexp slt tests [datafusion]

2025-04-16 Thread via GitHub
Omega359 commented on PR #15709: URL: https://github.com/apache/datafusion/pull/15709#issuecomment-2809757873 > > @comphead my understanding is that when a test file includes another file using the `include` directive (like `include ./init_data.slt.part`), the sqllogictest runner would proc

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on code in PR #66: URL: https://github.com/apache/datafusion-site/pull/66#discussion_r2047121149 ## content/blog/2025-04-17-user-defined-window-functions.md: ## @@ -0,0 +1,427 @@ +--- +layout: post +title: User defined Window Functions in DataFusion +date: 202

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on code in PR #66: URL: https://github.com/apache/datafusion-site/pull/66#discussion_r2047119302 ## content/blog/2025-04-04-datafusion-userdefined-window-functions.md: ## @@ -0,0 +1,346 @@ +--- +layout: post +title: User defined Window Functions in DataFusion

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on code in PR #66: URL: https://github.com/apache/datafusion-site/pull/66#discussion_r2047124587 ## content/blog/2025-04-04-datafusion-userdefined-window-functions.md: ## @@ -0,0 +1,154 @@ +--- +layout: post +title: User defined Window Functions in DataFusion

Re: [PR] Minor: include output partition count of `RepartitionExec` to tree explain [datafusion]

2025-04-16 Thread via GitHub
alamb merged PR #15717: URL: https://github.com/apache/datafusion/pull/15717

Re: [PR] Minor: include output partition count of `RepartitionExec` to tree explain [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #15717: URL: https://github.com/apache/datafusion/pull/15717#issuecomment-2809931598 Thanks @2010YOUY01

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on PR #66: URL: https://github.com/apache/datafusion-site/pull/66#issuecomment-2810035186 Hi @alamb, I have made some changes to spelling, etc. I think it is ready to be merged. Please look at the post one final time before merging. For me, it looks

Re: [PR] Update version to 47.0.0, add CHANGELOG [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #15731: URL: https://github.com/apache/datafusion/pull/15731#issuecomment-2810147921 Let's go! Thank you @xudong963

Re: [PR] Update version to 47.0.0, add CHANGELOG [datafusion]

2025-04-16 Thread via GitHub
alamb merged PR #15731: URL: https://github.com/apache/datafusion/pull/15731

Re: [PR] Apply pre-selection and computation skipping to short-circuit optimization [datafusion]

2025-04-16 Thread via GitHub
acking-you commented on code in PR #15694: URL: https://github.com/apache/datafusion/pull/15694#discussion_r2047346420 ## datafusion/physical-expr/src/expressions/binary.rs: ## @@ -811,58 +822,199 @@ impl BinaryExpr { } } +enum ShortCircuitStrategy<'a> { +None, +

Re: [I] Release DataFusion `47.0.0` (April 2025) [datafusion]

2025-04-16 Thread via GitHub
alamb commented on issue #15072: URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2810161688 I just merged the version + changelog PR from @xudong963 - https://github.com/apache/datafusion/pull/15731 I also created a `branch-47` here for the release: - https

Re: [PR] Apply pre-selection and computation skipping to short-circuit optimization [datafusion]

2025-04-16 Thread via GitHub
acking-you commented on PR #15694: URL: https://github.com/apache/datafusion/pull/15694#issuecomment-2810163625 > The only thing I think is needed in this PR is a few more tests for the `pre_selection_scatter` function and then it will be ready to go done

Re: [PR] Coerce and simplify FixedSizeBinary equality to literal binary [datafusion]

2025-04-16 Thread via GitHub
leoyvens commented on PR #15726: URL: https://github.com/apache/datafusion/pull/15726#issuecomment-2809641963 cc @jayzhan211, could I ask you for a review?

[PR] [wip] Add scripts for running benchmarks on EC2 [datafusion-comet]

2025-04-16 Thread via GitHub
andygrove opened a new pull request, #1654: URL: https://github.com/apache/datafusion-comet/pull/1654 ## Which issue does this PR close? Part of https://github.com/apache/datafusion-comet/issues/1636 ## Rationale for this change Make it easier for anyone t

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
Adez017 commented on code in PR #66: URL: https://github.com/apache/datafusion-site/pull/66#discussion_r2047198708 ## content/blog/2025-04-17-user-defined-window-functions.md: ## @@ -0,0 +1,427 @@ +--- +layout: post +title: User defined Window Functions in DataFusion +date: 202

[PR] chore(deps-dev): bump http-proxy-middleware from 2.0.6 to 2.0.9 in /datafusion/wasmtest/datafusion-wasm-app [datafusion]

2025-04-16 Thread via GitHub
dependabot[bot] opened a new pull request, #15738: URL: https://github.com/apache/datafusion/pull/15738 Bumps [http-proxy-middleware](https://github.com/chimurai/http-proxy-middleware) from 2.0.6 to 2.0.9. Release notes Sourced from https://github.com/chimurai/http-proxy-middlewar

Re: [PR] Add support for `PRINT` statement for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1811: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1811#discussion_r2047240782 ## tests/sqlparser_mssql.rs: ## @@ -2053,3 +2053,37 @@ fn parse_drop_trigger() { } ); } + +#[test] +fn parse_print() { +let print_

Re: [PR] User defined window functions blog post [datafusion-site]

2025-04-16 Thread via GitHub
alamb commented on PR #66: URL: https://github.com/apache/datafusion-site/pull/66#issuecomment-2810174306 Thanks @Adez017! Let's give it another day or two for any remaining comments and then I'll plan to publish it

Re: [I] Make it easier to run TPCH queries with datafusion-cli [datafusion]

2025-04-16 Thread via GitHub
alamb commented on issue #14608: URL: https://github.com/apache/datafusion/issues/14608#issuecomment-2810241760 > [@alamb](https://github.com/alamb) Yes once I address the couple of prioritized issues I have open for `v1.0.0` the next step will be to work on the integration, I agree with ha

Re: [D] DISCUSSION: Anyone around for the Databricks Data & AI Summit in San Francisco June 9–12? [datafusion]

2025-04-16 Thread via GitHub
GitHub user alamb added a comment to the discussion: DISCUSSION: Anyone around for the Databricks Data & AI Summit in San Francisco June 9–12? Casual meet / greet sounds good to me too (though we could do that another night as well). I think it would be nice to meet somewhere that people who

Re: [PR] Enable setting default values for target_partitions and planning_concurrency [datafusion]

2025-04-16 Thread via GitHub
alamb merged PR #15712: URL: https://github.com/apache/datafusion/pull/15712

Re: [PR] Perf: Support automatically concat_batches for sort which will improve performance [datafusion]

2025-04-16 Thread via GitHub
Dandandan commented on code in PR #15380: URL: https://github.com/apache/datafusion/pull/15380#discussion_r2047534508 ## datafusion/physical-plan/src/sorts/sort.rs: ## @@ -673,29 +676,211 @@ impl ExternalSorter { return self.sort_batch_stream(batch, metrics, reserva

[I] Support more types when pruning Parquet data [datafusion]

2025-04-16 Thread via GitHub
etseidl opened a new issue, #15742: URL: https://github.com/apache/datafusion/issues/15742 ### Is your feature request related to a problem or challenge? I've been working on implementing a new `ColumnOrder` for floating point columns in Parquet (https://github.com/apache/arrow-rs/pul

Re: [PR] feat: Add ConfigOptions to ScalarFunctionArgs [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #13527: URL: https://github.com/apache/datafusion/pull/13527#issuecomment-2810863393 > OptimizerConfig - this one is strange - it copies some things from ExecutionProps instead of just using it directly. Either this weirdness needs to be expanded or OptimizerConfig nee

Re: [PR] feat: Add ConfigOptions to ScalarFunctionArgs [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #13527: URL: https://github.com/apache/datafusion/pull/13527#issuecomment-2810864566 BTW how many fields from ConfigOptions really need to be copied? Can we just add the ones you need for spark? Or do we need a huge pile of them?

Re: [PR] Support some of pipe operators [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
simonvandel commented on PR #1759: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1759#issuecomment-2810341974 Hi @iffyio I'm sorry for the long response time. I have now revised the code according to your comments. Adding more tests found some bugs, so I had to ch

Re: [I] Optimized spill file format [datafusion]

2025-04-16 Thread via GitHub
alamb commented on issue #14078: URL: https://github.com/apache/datafusion/issues/14078#issuecomment-2810394103 > The tricky part to implement is array encoding like REE or bit-packing for integer arrays. Maybe we can find some reusable code in Arrow Parquet writer implementation or use som

Re: [PR] Add support for `PRINT` statement for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1811: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1811#discussion_r2047549413 ## tests/sqlparser_mssql.rs: ## @@ -2053,3 +2053,37 @@ fn parse_drop_trigger() { } ); } + +#[test] +fn parse_print() { +let print_

Re: [PR] feat: add `with_group_indices_order_mode` function for `GroupsAccumulator` to help create specialized impl [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #15022: URL: https://github.com/apache/datafusion/pull/15022#issuecomment-2810401697 🤖 `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking) Running Linux aal-dev 6.8.0-1016-gcp #18-Ubuntu SMP Fri Oct 4 22:16:29 UTC 2024 x86_

[PR] Final release note touchups [datafusion]

2025-04-16 Thread via GitHub
alamb opened a new pull request, #15741: URL: https://github.com/apache/datafusion/pull/15741 ## Which issue does this PR close? - part of https://github.com/apache/datafusion/pull/15740 - Forward port of https://github.com/apache/datafusion/pull/15740 ## Rationale f

Re: [PR] Add `CREATE FUNCTION` support for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2047762830 ## tests/sqlparser_mssql.rs: ## @@ -187,6 +188,386 @@ fn parse_mssql_create_procedure() { let _ = ms().verified_stmt("CREATE PROCEDURE [foo] AS

Re: [PR] Add `CREATE FUNCTION` support for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2047765609 ## src/parser/mod.rs: ## @@ -15017,6 +15075,13 @@ impl<'a> Parser<'a> { } } +fn parse_return(&mut self) -> Result { +let

Re: [PR] Add slt tests for `datafusion.execution.parquet.coerce_int96` setting [datafusion]

2025-04-16 Thread via GitHub
Omega359 commented on PR #15723: URL: https://github.com/apache/datafusion/pull/15723#issuecomment-2810768679 lgtm

Re: [PR] Add `CREATE FUNCTION` support for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2047770271 ## src/ast/mod.rs: ## @@ -9211,6 +9253,41 @@ pub enum CopyIntoSnowflakeKind { Location, } +/// Return (MsSql) +/// +/// for Functions: +/// R

Re: [PR] Add `CREATE FUNCTION` support for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2047770466 ## src/ast/mod.rs: ## @@ -2317,15 +2313,37 @@ impl fmt::Display for ConditionalStatements { } Ok(()) }

Re: [PR] Add `CREATE FUNCTION` support for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2047771506 ## tests/sqlparser_mssql.rs: ## @@ -187,6 +188,386 @@ fn parse_mssql_create_procedure() { let _ = ms().verified_stmt("CREATE PROCEDURE [foo] AS

Re: [PR] build(deps): bump tokio from 1.44.1 to 1.44.2 in /native [datafusion-comet]

2025-04-16 Thread via GitHub
dependabot[bot] closed pull request #1622: build(deps): bump tokio from 1.44.1 to 1.44.2 in /native URL: https://github.com/apache/datafusion-comet/pull/1622

Re: [PR] build(deps): bump tokio from 1.44.1 to 1.44.2 in /native [datafusion-comet]

2025-04-16 Thread via GitHub
dependabot[bot] commented on PR #1622: URL: https://github.com/apache/datafusion-comet/pull/1622#issuecomment-2811542017 Looks like tokio is up-to-date now, so this is no longer needed.

Re: [I] GROUP BY constant with aggregation function and with empty input magically summons a row [datafusion]

2025-04-16 Thread via GitHub
qstommyshu commented on issue #15734: URL: https://github.com/apache/datafusion/issues/15734#issuecomment-2810958031 I can take a look next week, seems like a parser issue, similar to #5221.

Re: [PR] Perf: Support automatically concat_batches for sort which will improve performance [datafusion]

2025-04-16 Thread via GitHub
Dandandan commented on code in PR #15380: URL: https://github.com/apache/datafusion/pull/15380#discussion_r2047540213 ## datafusion/physical-plan/src/sorts/sort.rs: ## @@ -673,29 +676,211 @@ impl ExternalSorter { return self.sort_batch_stream(batch, metrics, reserva

Re: [PR] Add `CREATE FUNCTION` support for SQL Server [datafusion-sqlparser-rs]

2025-04-16 Thread via GitHub
aharpervc commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2047853472 ## src/ast/spans.rs: ## @@ -777,11 +778,9 @@ impl Spanned for ConditionalStatements { ConditionalStatements::Sequence { statements } =>

Re: [PR] Set DataFusion runtime configurations through SQL interface [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #15594: URL: https://github.com/apache/datafusion/pull/15594#issuecomment-2810565181 I merged up from main to rerun CI to be sure

Re: [PR] chore: Upgrade to datafusion 47.0.0-rc1 and arrow-rs 55.0.0 [datafusion-comet]

2025-04-16 Thread via GitHub
andygrove merged PR #1563: URL: https://github.com/apache/datafusion-comet/pull/1563

Re: [I] Upgrade to DataFusion 47.0.0 [datafusion-comet]

2025-04-16 Thread via GitHub
andygrove closed issue #1634: Upgrade to DataFusion 47.0.0 URL: https://github.com/apache/datafusion-comet/issues/1634

Re: [PR] Fix Infer prepare statement type tests [datafusion]

2025-04-16 Thread via GitHub
brayanjuls commented on code in PR #15743: URL: https://github.com/apache/datafusion/pull/15743#discussion_r2048130471 ## datafusion/sql/tests/sql_integration.rs: ## @@ -4673,16 +4675,17 @@ fn test_infer_types_from_predicate() { } #[test] -fn test_infer_types_from_between_pr

[PR] Fix Infer prepare statement type tests [datafusion]

2025-04-16 Thread via GitHub
brayanjuls opened a new pull request, #15743: URL: https://github.com/apache/datafusion/pull/15743 ## Which issue does this PR close? - Closes #15577 ## Rationale for this change This intends to correct issues in the prepare statement type inference tests. Currently they are

Re: [I] Set DataFusion runtime configurations through SQL interface [datafusion]

2025-04-16 Thread via GitHub
2010YOUY01 closed issue #15552: Set DataFusion runtime configurations through SQL interface URL: https://github.com/apache/datafusion/issues/15552

Re: [PR] Cascaded spill merge and re-spill [datafusion]

2025-04-16 Thread via GitHub
2010YOUY01 commented on PR #15610: URL: https://github.com/apache/datafusion/pull/15610#issuecomment-2811652139 > Thank you, can you please take the fuzz test that I created in my pr and add it to yours, making sure it will pass (it will require you updating `row_hash.rs` file Those

Re: [PR] Improve `simplify_expressions` rule [datafusion]

2025-04-16 Thread via GitHub
xudong963 commented on code in PR #15735: URL: https://github.com/apache/datafusion/pull/15735#discussion_r2048178156 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -188,7 +188,7 @@ impl ExprSimplifier { /// assert_eq!(expr, b_lt_2); /// ```

Re: [PR] Support `Accumulator` for avg duration [datafusion]

2025-04-16 Thread via GitHub
shruti2522 commented on PR #15468: URL: https://github.com/apache/datafusion/pull/15468#issuecomment-2811840204 > I think you need to update the signature of Avg to support the new type as well @alamb this one's ready for review whenever you get a chance.

Re: [PR] feat: enhance-CLI-query-header-for-cast-expressions-with-literals [datafusion]

2025-04-16 Thread via GitHub
qstommyshu commented on code in PR #15736: URL: https://github.com/apache/datafusion/pull/15736#discussion_r2048003662 ## datafusion/sql/src/parser.rs: ## @@ -469,9 +473,101 @@ impl<'a> DFParser<'a> { } _ => { //

Re: [PR] refactor!: consistent null handling in coercible signatures [datafusion]

2025-04-16 Thread via GitHub
alamb commented on PR #15404: URL: https://github.com/apache/datafusion/pull/15404#issuecomment-2810861213 Thanks @alan910127 -- I'll check it out in a few days

Re: [PR] Perf: Support automatically concat_batches for sort which will improve performance [datafusion]

2025-04-16 Thread via GitHub
2010YOUY01 commented on PR #15380: URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2811725054 I really like this idea, I find this approach will not conflict with several future optimizations in my mind: - Use row format for sorting and reuse converted `Row`s for SPM, th

Re: [PR] chore(deps): bump indexmap from 2.8.0 to 2.9.0 [datafusion]

2025-04-16 Thread via GitHub
xudong963 merged PR #15732: URL: https://github.com/apache/datafusion/pull/15732

Re: [PR] Perf: Support automatically concat_batches for sort which will improve performance [datafusion]

2025-04-16 Thread via GitHub
Dandandan commented on PR #15380: URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2808612984 I wonder if we can make a more "simple" change for now: * `concat` regresses because it copies _all columns_ of the record batch before sorting. * We can concat the sorting

Re: [PR] [WIP] docs: Add instructions on running TPC-H on macOS [datafusion-comet]

2025-04-16 Thread via GitHub
wForget commented on code in PR #1647: URL: https://github.com/apache/datafusion-comet/pull/1647#discussion_r2046390607 ## docs/source/contributor-guide/benchmarking_macos.md: ## @@ -0,0 +1,136 @@ + + +# Comet Benchmarking on macOS + +This guide is for setting up TPC-H benchmark

Re: [PR] test: add fuzz test for doing aggregation with larger than memory groups and sorting with limited memory [datafusion]

2025-04-16 Thread via GitHub
Rachelint commented on PR #15727: URL: https://github.com/apache/datafusion/pull/15727#issuecomment-2808699841 It is really a good coverage improvement for aggr testing!

Re: [PR] test: add fuzz test for doing aggregation with larger than memory groups and sorting with limited memory [datafusion]

2025-04-16 Thread via GitHub
Rachelint commented on PR #15727: URL: https://github.com/apache/datafusion/pull/15727#issuecomment-2808699865 It is really a good coverage improvement for aggr testing!
