Re: [I] Add `CoalesceBatchesExec` to `NestedLoopJoinExec` [datafusion]

2025-06-10 Thread via GitHub
jonathanc-n commented on issue #16328: URL: https://github.com/apache/datafusion/issues/16328#issuecomment-2958096560 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

Re: [PR] feat: support FixedSizeList for array_has [datafusion]

2025-06-10 Thread via GitHub
chenkovsky commented on code in PR #16333: URL: https://github.com/apache/datafusion/pull/16333#discussion_r2137107980 ## datafusion/functions-nested/src/array_has.rs: ## @@ -232,98 +236,244 @@ fn array_has_inner_for_array(haystack: &ArrayRef, needle: &ArrayRef) -> Result array

[I] `Span` for `Expr::Case` does not include the heading and trailing keywords [datafusion-sqlparser-rs]

2025-06-10 Thread via GitHub
eliaperantoni opened a new issue, #1878: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1878 ### How to reproduce 1. Build an `Expr::Case` expression, e.g.: ```sql CASE col1 WHEN col2 THEN col3 ELSE col4 END ``` 2. Call `Spanned::span` on it. You

Re: [PR] bug: remove busy-wait while sort is ongoing [datafusion]

2025-06-10 Thread via GitHub
pepijnve commented on PR #16322: URL: https://github.com/apache/datafusion/pull/16322#issuecomment-2958304164 Waiting for benchmarks results here so I have some time to write up my assessment of what was happening and what has changed. This is just to assist any reviewers, not to replace re

[PR] chore(deps): bump clap from 4.5.39 to 4.5.40 [datafusion]

2025-06-10 Thread via GitHub
dependabot[bot] opened a new pull request, #16354: URL: https://github.com/apache/datafusion/pull/16354 Bumps [clap](https://github.com/clap-rs/clap) from 4.5.39 to 4.5.40. Changelog: sourced from [clap's changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md). [4

[I] [Epic] Pipeline breaking cancellation support [datafusion]

2025-06-10 Thread via GitHub
zhuqi-lucas opened a new issue, #16353: URL: https://github.com/apache/datafusion/issues/16353 ### Is your feature request related to a problem or challenge? We have done the first step in https://github.com/apache/datafusion/pull/16196 for pipeline breaking cancellation support, th

Re: [I] `Span` for `Expr::Case` does not include the heading and trailing keywords [datafusion-sqlparser-rs]

2025-06-10 Thread via GitHub
eliaperantoni closed issue #1878: `Span` for `Expr::Case` does not include the heading and trailing keywords URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1878

Re: [I] `Span` for `Expr::Case` does not include the heading and trailing keywords [datafusion-sqlparser-rs]

2025-06-10 Thread via GitHub
eliaperantoni commented on issue #1878: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1878#issuecomment-2957881999 Fixed in #1874

Re: [PR] Fix array_agg memory over accounting [datafusion]

2025-06-10 Thread via GitHub
gabotechs commented on code in PR #16346: URL: https://github.com/apache/datafusion/pull/16346#discussion_r2137044002 ## datafusion/common/src/scalar/mod.rs: ## @@ -3525,6 +3525,12 @@ impl ScalarValue { } } } + +/// Compacts ([ScalarValue::compact]

Re: [PR] bug: remove busy-wait while sort is ongoing [datafusion]

2025-06-10 Thread via GitHub
pepijnve commented on code in PR #16322: URL: https://github.com/apache/datafusion/pull/16322#discussion_r2137392057 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -216,36 +212,50 @@ impl SortPreservingMergeStream { // Once all partitions have set their correspon

Re: [I] Optimize performance of `ByteViewGroupValueBuilder` on batches with inlined views [datafusion]

2025-06-10 Thread via GitHub
Rachelint commented on issue #16330: URL: https://github.com/apache/datafusion/issues/16330#issuecomment-2958350276 Seems interesting, plan to try it tonight.

Re: [PR] bug: remove busy-wait while sort is ongoing [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on code in PR #16322: URL: https://github.com/apache/datafusion/pull/16322#discussion_r2137384940 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -216,36 +212,50 @@ impl SortPreservingMergeStream { // Once all partitions have set their correspo

Re: [PR] bug: remove busy-wait while sort is ongoing [datafusion]

2025-06-10 Thread via GitHub
pepijnve commented on code in PR #16322: URL: https://github.com/apache/datafusion/pull/16322#discussion_r2137438301 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -216,36 +212,50 @@ impl SortPreservingMergeStream { // Once all partitions have set their correspon

Re: [PR] bug: remove busy-wait while sort is ongoing [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on code in PR #16322: URL: https://github.com/apache/datafusion/pull/16322#discussion_r2137439468 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -216,36 +212,50 @@ impl SortPreservingMergeStream { // Once all partitions have set their correspo

Re: [PR] bug: remove busy-wait while sort is ongoing [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on code in PR #16322: URL: https://github.com/apache/datafusion/pull/16322#discussion_r2137474689 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -216,36 +212,50 @@ impl SortPreservingMergeStream { // Once all partitions have set their correspo

Re: [PR] bug: remove busy-wait while sort is ongoing [datafusion]

2025-06-10 Thread via GitHub
pepijnve commented on code in PR #16322: URL: https://github.com/apache/datafusion/pull/16322#discussion_r2137478983 ## datafusion/physical-plan/src/sorts/merge.rs: ## @@ -216,36 +212,50 @@ impl SortPreservingMergeStream { // Once all partitions have set their correspon

Re: [PR] chore: Replace archived actions-rs/install action [datafusion-sqlparser-rs]

2025-06-10 Thread via GitHub
alamb commented on code in PR #1876: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1876#discussion_r2138936424 ## .github/workflows/rust.yml: ## @@ -85,11 +88,8 @@ jobs: uses: ./.github/actions/setup-builder with: rust-version: ${{ matrix.ru

Re: [PR] chore: Replace archived actions-rs/install action [datafusion-sqlparser-rs]

2025-06-10 Thread via GitHub
alamb merged PR #1876: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1876

Re: [I] [Epic] Pipeline breaking cancellation support and improvement [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16353: URL: https://github.com/apache/datafusion/issues/16353#issuecomment-2960837514 Thank you @zhuqi-lucas -- I also added this as a wishlist item for https://github.com/apache/datafusion/issues/16235

Re: [PR] Add note in upgrade guide about changes to `Expr::Scalar` in 48.0.0 [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16360: URL: https://github.com/apache/datafusion/pull/16360#discussion_r2138947851 ## docs/source/library-user-guide/upgrading.md: ## @@ -200,7 +235,7 @@ working but no one knows due to lack of test coverage). [api deprecation guidelines]: http

[PR] Add note in upgrade guide about changes to `Expr::Scalar` in 48.0.0 [datafusion]

2025-06-10 Thread via GitHub
alamb opened a new pull request, #16360: URL: https://github.com/apache/datafusion/pull/16360 ## Which issue does this PR close? - Follow on to https://github.com/apache/datafusion/pull/16170 ## Rationale for this change I hit another required change while testing the delta-rs up

Re: [PR] feat: add metadata to literal expressions [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16170: URL: https://github.com/apache/datafusion/pull/16170#issuecomment-2960852408 I also made a PR to add a note to the upgrade guide here: - https://github.com/apache/datafusion/pull/16360

Re: [I] [EPIC] Improve shuffle performance [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on issue #1123: URL: https://github.com/apache/datafusion-comet/issues/1123#issuecomment-2960359323 I'm going to go ahead and close this now that almost all of the issues linked to this epic have been implemented. We have definitely seen a significant improvement in shu

Re: [I] [EPIC] Improve shuffle performance [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed issue #1123: [EPIC] Improve shuffle performance URL: https://github.com/apache/datafusion-comet/issues/1123

Re: [PR] fix: Fix SparkSha2 to be compliant with Spark response and add support for Int32 [datafusion]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #16350: URL: https://github.com/apache/datafusion/pull/16350#discussion_r2138620599 ## datafusion/spark/src/function/math/hex.rs: ## @@ -192,7 +195,7 @@ pub fn spark_hex(args: &[ColumnarValue]) -> Result

Re: [I] Optimize `NestedLoopJoinExec` Memory Usage [datafusion]

2025-06-10 Thread via GitHub
UBarney commented on issue #16364: URL: https://github.com/apache/datafusion/issues/16364#issuecomment-2961161122 take

Re: [PR] Document `copy_array_data` function with example [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16361: URL: https://github.com/apache/datafusion/pull/16361#issuecomment-2961165506 Thank you for the review @2010YOUY01

Re: [PR] feat: mapping sql Char/Text/String default to Utf8View [datafusion]

2025-06-10 Thread via GitHub
zhuqi-lucas commented on code in PR #16290: URL: https://github.com/apache/datafusion/pull/16290#discussion_r2139142869 ## datafusion/sqllogictest/test_files/arrow_files.slt: ## @@ -61,22 +61,12 @@ LOCATION '../core/tests/data/partitioned_table_arrow/' PARTITIONED BY (part);

Re: [PR] Encapsulate metadata for literals on to a `FieldMetadata` structure [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16317: URL: https://github.com/apache/datafusion/pull/16317

Re: [I] to_hex cannot take UInt64 [datafusion]

2025-06-10 Thread via GitHub
alamb closed issue #16327: to_hex cannot take UInt64 URL: https://github.com/apache/datafusion/issues/16327

Re: [PR] Add support `UInt64` and other integer data types for `to_hex` [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16335: URL: https://github.com/apache/datafusion/pull/16335

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961173895 FWIW @phillipleblanc also hit this as well in SpiceAI. See this for more details - https://github.com/spiceai/spiceai/pull/6178

Re: [I] Request to update crates.io ownership [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16323: URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2961176597 👀

Re: [PR] Add support `UInt64` and other integer data types for `to_hex` [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16335: URL: https://github.com/apache/datafusion/pull/16335#issuecomment-2961170037 Thanks @tlm365 and @xudong963

Re: [I] Request to update crates.io ownership [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16323: URL: https://github.com/apache/datafusion/issues/16323#issuecomment-2961181201 I am sorry @xudong963 -- there were apparently many other crates I didn't add you as an owner for. I have just sent a bunch more invitations -- hopefully after you accept them th

[PR] chore: Stop Running Spark SQL tests for Spark 3.5.4 and 3.5.5 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove opened a new pull request, #1870: URL: https://github.com/apache/datafusion-comet/pull/1870 ## Which issue does this PR close? N/A ## Rationale for this change Reduce developer overhead of keeping multiple diff files up-to-date for Spark 3.5 and

Re: [PR] chore: Skip some Spark SQL test runs on PRs [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1868: chore: Skip some Spark SQL test runs on PRs URL: https://github.com/apache/datafusion-comet/pull/1868

Re: [D] Search Pushdown (e.g. Vector Search) Into Table Providers [datafusion]

2025-06-10 Thread via GitHub
GitHub user backkem added a comment to the discussion: Search Pushdown (e.g. Vector Search) Into Table Providers FWIW there is an existing implementation of the "full logical plan pushdown" in [datafusion-contrib/datafusion-federation](https://github.com/datafusion-contrib/datafusion-federatio

Re: [PR] [PoC] Add API for tracking distinct arrays in `MemoryPool` by reference count [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on PR #16359: URL: https://github.com/apache/datafusion/pull/16359#issuecomment-2960579460 Hm - `.slice()` of course creates another `Arc`, so we actually have to use the buffers `Arc` rather than arrays.

Re: [PR] feat: Only create one native plan for a query on an executor [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #1203: URL: https://github.com/apache/datafusion-comet/pull/1203#issuecomment-2960622532 I am closing this PR since it has not been active lately.

Re: [PR] feat: Only create one native plan for a query on an executor [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1203: feat: Only create one native plan for a query on an executor URL: https://github.com/apache/datafusion-comet/pull/1203

Re: [PR] feat: ANSI support for Add [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #616: URL: https://github.com/apache/datafusion-comet/pull/616#issuecomment-2960624710 I am closing this PR since it has not been active lately. Feel free to re-open if needed.

Re: [PR] chore: add a utils method to getColumnReader with SQLConf [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #360: URL: https://github.com/apache/datafusion-comet/pull/360#issuecomment-2960623336 @huaxingao Is this PR still needed?

Re: [PR] build(deps): bump com.google.protobuf:protobuf-java from 3.19.6 to 3.25.5 [datafusion-comet]

2025-06-10 Thread via GitHub
dependabot[bot] commented on PR #954: URL: https://github.com/apache/datafusion-comet/pull/954#issuecomment-2960626032 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor versi

Re: [PR] build: Upgrade Spark 4.0 to preview2 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #955: URL: https://github.com/apache/datafusion-comet/pull/955#issuecomment-2960626602 I am closing this PR since it has not been active lately. Feel free to re-open if needed.

Re: [PR] build: Upgrade Spark 4.0 to preview2 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #955: build: Upgrade Spark 4.0 to preview2 URL: https://github.com/apache/datafusion-comet/pull/955

Re: [PR] build(deps): bump com.google.protobuf:protobuf-java from 3.19.6 to 3.25.5 [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #954: build(deps): bump com.google.protobuf:protobuf-java from 3.19.6 to 3.25.5 URL: https://github.com/apache/datafusion-comet/pull/954

Re: [PR] chore: Override node name for CometSparkToColumnar [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #958: URL: https://github.com/apache/datafusion-comet/pull/958#issuecomment-2960628775 I am closing this PR since it is no longer active. Feel free to re-open if needed.

Re: [PR] chore: Override node name for CometSparkToColumnar [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #958: chore: Override node name for CometSparkToColumnar URL: https://github.com/apache/datafusion-comet/pull/958

Re: [PR] Document `copy_array_data` function with example [datafusion]

2025-06-10 Thread via GitHub
2010YOUY01 commented on code in PR #16361: URL: https://github.com/apache/datafusion/pull/16361#discussion_r2139062181 ## datafusion/common/src/scalar/mod.rs: ## @@ -3527,6 +3527,33 @@ impl ScalarValue { } } +/// Compacts the data of an `ArrayData` into a new `ArrayData`

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
xudong963 commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961047647 Fyi, we've upgraded to DF47 with the fix successfully.

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
adriangb commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961049931 Great to hear! Sorry for the inconvenience...

Re: [PR] docs: Expand `MemoryPool` docs with related structs [datafusion]

2025-06-10 Thread via GitHub
2010YOUY01 merged PR #16289: URL: https://github.com/apache/datafusion/pull/16289

Re: [PR] [PoC] Add API for tracking distinct buffers in `MemoryPool` by reference count [datafusion]

2025-06-10 Thread via GitHub
2010YOUY01 commented on code in PR #16359: URL: https://github.com/apache/datafusion/pull/16359#discussion_r2139040880 ## datafusion/execution/src/memory_pool/mod.rs: ## @@ -131,14 +133,58 @@ pub trait MemoryPool: Send + Sync + std::fmt::Debug { /// This must always succeed

Re: [PR] Update parser recursion limit from 50 to 100 [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15622: URL: https://github.com/apache/datafusion/pull/15622#issuecomment-2961018161 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Revert use file schema in parquet pruning [datafusion]

2025-06-10 Thread via GitHub
xudong963 commented on PR #16086: URL: https://github.com/apache/datafusion/pull/16086#issuecomment-2961055915 > Great to hear! Sorry for the inconvenience... No problem!

Re: [PR] fix: Fix SparkSha2 to be compliant with Spark response and add support for Int32 [datafusion]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #16350: URL: https://github.com/apache/datafusion/pull/16350#discussion_r2138607063 ## datafusion/spark/src/function/math/hex.rs: ## @@ -192,7 +195,7 @@ pub fn spark_hex(args: &[ColumnarValue]) -> Result

[PR] chore: Enable more Spark SQL tests (issue #231) [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove opened a new pull request, #1869: URL: https://github.com/apache/datafusion-comet/pull/1869 ## Which issue does this PR close? Closes #231 ## Rationale for this change Enable more tests ## What changes are included in this PR?

[PR] Add API for tracking distinct arrays [datafusion]

2025-06-10 Thread via GitHub
Dandandan opened a new pull request, #16359: URL: https://github.com/apache/datafusion/pull/16359 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [PR] chore(deps): bump syn from 2.0.101 to 2.0.102 [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16355: URL: https://github.com/apache/datafusion/pull/16355

Re: [PR] chore(deps): bump clap from 4.5.39 to 4.5.40 [datafusion]

2025-06-10 Thread via GitHub
alamb merged PR #16354: URL: https://github.com/apache/datafusion/pull/16354

[I] Ensure Substrait consumer can handle expressions in VirtualTable [datafusion]

2025-06-10 Thread via GitHub
lorenarosati opened a new issue, #16363: URL: https://github.com/apache/datafusion/issues/16363 ### Describe the bug When we convert a Substrait plan (which includes a virtual table) to a DataFusion LogicalPlan using the `from_substrait_plan()` [function](https://github.com/apache/da

Re: [I] Ensure Substrait consumer can handle expressions in VirtualTable [datafusion]

2025-06-10 Thread via GitHub
lorenarosati commented on issue #16363: URL: https://github.com/apache/datafusion/issues/16363#issuecomment-2961113864 take

Re: [PR] chore: Enable more Spark SQL tests [datafusion-comet]

2025-06-10 Thread via GitHub
codecov-commenter commented on PR #1869: URL: https://github.com/apache/datafusion-comet/pull/1869#issuecomment-2960381202 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1869?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] feat: support RangePartitioning with native shuffle [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #1862: URL: https://github.com/apache/datafusion-comet/pull/1862#discussion_r2138821450 ## native/core/src/execution/shuffle/range_partitioner.rs: ## @@ -0,0 +1,432 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

Re: [PR] feat: support RangePartitioning with native shuffle [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #1862: URL: https://github.com/apache/datafusion-comet/pull/1862#discussion_r2138822721 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2904,6 +2903,8 @@ object QueryPlanSerde extends Logging with CometExprShim {

[I] Optimize `NestedLoopJoinExec` Memory Usage [datafusion]

2025-06-10 Thread via GitHub
UBarney opened a new issue, #16364: URL: https://github.com/apache/datafusion/issues/16364 ### Is your feature request related to a problem or challenge? The current Nested Loop Join implementation follows this simplified logic: 1. Buffer the Build Side: All data from the left (buil

Re: [PR] feat: Parquet modular encryption [datafusion]

2025-06-10 Thread via GitHub
alamb commented on code in PR #16351: URL: https://github.com/apache/datafusion/pull/16351#discussion_r2139135378 ## datafusion/common/src/config.rs: ## @@ -591,6 +930,12 @@ config_namespace! { /// writing out already in-memory data, such as from a cached /// d

Re: [PR] Perf: load default Utf8View for CSV datatype [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #16243: URL: https://github.com/apache/datafusion/pull/16243#issuecomment-2961158371 Marking as draft as I think this PR is no longer waiting on feedback and I am trying to make it easier to find PRs in need of review. Please mark it as ready for review when it is read

Re: [PR] fix: support read Struct by user schema [datafusion-comet]

2025-06-10 Thread via GitHub
comphead commented on code in PR #1860: URL: https://github.com/apache/datafusion-comet/pull/1860#discussion_r2139136916 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -2728,4 +2728,35 @@ class CometExpressionSuite extends CometTestBase with Adaptive

Re: [PR] feat: add expression array_size [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #1122: URL: https://github.com/apache/datafusion-comet/pull/1122#issuecomment-2960629920 I am closing this PR since it is no longer active. Feel free to re-open if needed.

Re: [PR] feat: Implement ANSI support for Round [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #989: feat: Implement ANSI support for Round URL: https://github.com/apache/datafusion-comet/pull/989

Re: [PR] Add ANSI support for Add, Subtract & Multiply [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #1135: URL: https://github.com/apache/datafusion-comet/pull/1135#issuecomment-2960630950 I am closing this PR since it is no longer active. Feel free to re-open if needed.

Re: [PR] feat: Implement ANSI support for Round [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on PR #989: URL: https://github.com/apache/datafusion-comet/pull/989#issuecomment-2960629321 I am closing this PR since it is no longer active. Feel free to re-open if needed.

Re: [PR] Add ANSI support for Add, Subtract & Multiply [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1135: Add ANSI support for Add, Subtract & Multiply URL: https://github.com/apache/datafusion-comet/pull/1135

Re: [PR] feat: add expression array_size [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1122: feat: add expression array_size URL: https://github.com/apache/datafusion-comet/pull/1122

Re: [PR] feat: Add aggregate expression fuzz testing in CI [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #1374: feat: Add aggregate expression fuzz testing in CI URL: https://github.com/apache/datafusion-comet/pull/1374

Re: [PR] chore: Stop Running Spark SQL tests for Spark 3.5.4 and 3.5.5 [datafusion-comet]

2025-06-10 Thread via GitHub
codecov-commenter commented on PR #1870: URL: https://github.com/apache/datafusion-comet/pull/1870#issuecomment-2960634456 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1870?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] feat: ANSI support for Add [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed pull request #616: feat: ANSI support for Add URL: https://github.com/apache/datafusion-comet/pull/616

Re: [PR] feat: support RangePartitioning with native shuffle [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on code in PR #1862: URL: https://github.com/apache/datafusion-comet/pull/1862#discussion_r2138821947 ## native/core/src/execution/shuffle/range_partitioner.rs: ## @@ -0,0 +1,432 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

Re: [PR] chore: refactor Substrait consumer's "rename_field" and implement the rest of types [datafusion]

2025-06-10 Thread via GitHub
westonpace commented on code in PR #16345: URL: https://github.com/apache/datafusion/pull/16345#discussion_r2138834078 ## datafusion/substrait/src/logical_plan/consumer/utils.rs: ## @@ -81,98 +81,167 @@ pub(super) fn next_struct_field_name( } } -pub(super) fn rename_fiel

Re: [I] Add fuzz testing to CI [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove closed issue #1373: Add fuzz testing to CI URL: https://github.com/apache/datafusion-comet/issues/1373

Re: [I] Add fuzz testing to CI [datafusion-comet]

2025-06-10 Thread via GitHub
andygrove commented on issue #1373: URL: https://github.com/apache/datafusion-comet/issues/1373#issuecomment-2960683018 We do now have a `CometFuzzSuite`

[PR] Simplify predicates in filter [datafusion]

2025-06-10 Thread via GitHub
xudong963 opened a new pull request, #16362: URL: https://github.com/apache/datafusion/pull/16362 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [PR] WIP: Test enabling Parquet filter pushdown with parquet caching page cache reader [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15506: URL: https://github.com/apache/datafusion/pull/15506#issuecomment-2961018300 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] feat(sql): add diagnostic for wrong number of function arguments [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15490: URL: https://github.com/apache/datafusion/pull/15490#issuecomment-2961018338 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] docs: add conventional commit guide and PR title examples [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15638: URL: https://github.com/apache/datafusion/pull/15638#issuecomment-2961018128 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] chore: move `optimize_subquery_sort` into optimizer [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15441: URL: https://github.com/apache/datafusion/pull/15441#issuecomment-2961018385 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] fix: union all by name [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15603: URL: https://github.com/apache/datafusion/pull/15603#issuecomment-2961018203 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] WIP: Aggregate UDF FFI [datafusion]

2025-06-10 Thread via GitHub
github-actions[bot] commented on PR #15510: URL: https://github.com/apache/datafusion/pull/15510#issuecomment-2961018258 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] chore: refactor Substrait consumer's "rename_field" and implement the rest of types [datafusion]

2025-06-10 Thread via GitHub
gabotechs commented on code in PR #16345: URL: https://github.com/apache/datafusion/pull/16345#discussion_r2139335258 ## datafusion/substrait/src/logical_plan/consumer/utils.rs: ## @@ -81,98 +81,167 @@ pub(super) fn next_struct_field_name( } } -pub(super) fn rename_field

Re: [PR] Re-Add CodeCov [datafusion]

2025-06-10 Thread via GitHub
alamb commented on PR #15256: URL: https://github.com/apache/datafusion/pull/15256#issuecomment-2961234354 I am trying to clean up the review queue and it wasn't clear to me what the plan for this PR is, so marking it as draft

Re: [PR] fix: Fix SparkSha2 to be compliant with Spark response and add support for Int32 [datafusion]

2025-06-10 Thread via GitHub
rishvin commented on PR #16350: URL: https://github.com/apache/datafusion/pull/16350#issuecomment-2961232868 Thanks @andygrove / @getChan for the feedback. Please review the changes.

Re: [I] DF 48 upgrade guide missing window function breaking change [datafusion]

2025-06-10 Thread via GitHub
alamb commented on issue #16326: URL: https://github.com/apache/datafusion/issues/16326#issuecomment-2961213135 I believe this was fixed in - https://github.com/apache/datafusion/pull/16313

[I] Improve performance of `datafusion-cli` when reading from remote storage [datafusion]

2025-06-10 Thread via GitHub
alamb opened a new issue, #16365: URL: https://github.com/apache/datafusion/issues/16365 ### Is your feature request related to a problem or challenge? - Part of https://github.com/apache/datafusion/pull/16300/files While testing https://github.com/apache/datafusion/pull/16300,

Re: [PR] [PoC] Add API for tracking distinct buffers in `MemoryPool` by reference count [datafusion]

2025-06-10 Thread via GitHub
Dandandan commented on code in PR #16359: URL: https://github.com/apache/datafusion/pull/16359#discussion_r2139196390 ## datafusion/execution/src/memory_pool/mod.rs: ## @@ -131,14 +133,58 @@ pub trait MemoryPool: Send + Sync + std::fmt::Debug { /// This must always succeed
