[PR] Add support for `RAISE` statement [datafusion-sqlparser-rs]

2025-03-17 Thread via GitHub
iffyio opened a new pull request, #1766: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1766 ```sql RAISE USING MESSAGE = 'error'; ``` [BigQuery](https://cloud.google.com/bigquery/docs/reference/standard-sql/procedural-language#raise) [Snowflake](https://doc

Re: [D] Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? [datafusion]

2025-03-17 Thread via GitHub
GitHub user alamb edited a comment on the discussion: Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? You could create a function called `jsonb_path_exists` that takes a binary column and a json path string perhaps? You could take a look at how it is done

Re: [D] Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? [datafusion]

2025-03-17 Thread via GitHub
GitHub user alamb added a comment to the discussion: Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? You could create a function called `jsonb_path_exists` that takes a binary column and a json path string perhaps? You could take a look at how it is done h

Re: [I] [Epic] A collection of items related to processing larger than memory datasets (via spilling, externalized algorithm, etc) [datafusion]

2025-03-17 Thread via GitHub
alamb closed issue #14077: [Epic] A collection of items related to processing larger than memory datasets (via spilling, externalized algorithm, etc) URL: https://github.com/apache/datafusion/issues/14077 -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [I] [Epic] A collection of items related to processing larger than memory datasets (via spilling, externalized algorithm, etc) [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #14077: URL: https://github.com/apache/datafusion/issues/14077#issuecomment-2729497623 I broke this ticket into two follow on ones, one focused on hashing and one focused on sorting: - https://github.com/apache/datafusion/issues/15271 - https://github.com/apach

Re: [I] [Epic] A collection of items related to processing larger than memory datasets (via spilling, externalized algorithm, etc) [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #14077: URL: https://github.com/apache/datafusion/issues/14077#issuecomment-2729483092 Since there was recent excitement/activity about better sorting behavior, I filed an EPIC for just that: - https://github.com/apache/datafusion/issues/14692

Re: [PR] Support bounds evaluation for temporal data types [datafusion]

2025-03-17 Thread via GitHub
ch-sc commented on PR #14523: URL: https://github.com/apache/datafusion/pull/14523#issuecomment-2728490058 Hi @berkaysynnada, I should be able to spend some time on this at the end of this week.

Re: [I] Weekly Plan (Andrew Lamb) March 10, 2025 [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #15121: URL: https://github.com/apache/datafusion/issues/15121#issuecomment-2729841089 - Next week: https://github.com/apache/datafusion/issues/15274

Re: [PR] Add upgrade notes for array signatures [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15237: URL: https://github.com/apache/datafusion/pull/15237#issuecomment-2729725471 Thanks @jkosh44 -- I'll wait for another day or two to merge this one in to give time for others to respond too

Re: [I] Weekly Plan (Andrew Lamb) March 10, 2025 [datafusion]

2025-03-17 Thread via GitHub
alamb closed issue #15121: Weekly Plan (Andrew Lamb) March 10, 2025 URL: https://github.com/apache/datafusion/issues/15121

[PR] Chore/add codecov demo 3 [datafusion]

2025-03-17 Thread via GitHub
blaginin opened a new pull request, #15273: URL: https://github.com/apache/datafusion/pull/15273 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested

Re: [I] Add some DataFrame method(s) to combine two inputs where the schema can be different [datafusion]

2025-03-17 Thread via GitHub
Omega359 commented on issue #12650: URL: https://github.com/apache/datafusion/issues/12650#issuecomment-2729820620 take

Re: [PR] Chore/add codecov demo 3 [datafusion]

2025-03-17 Thread via GitHub
blaginin closed pull request #15273: Chore/add codecov demo 3 URL: https://github.com/apache/datafusion/pull/15273

[PR] Implement tree explain for UnionExec [datafusion]

2025-03-17 Thread via GitHub
zebsme opened a new pull request, #15278: URL: https://github.com/apache/datafusion/pull/15278 ## Which issue does this PR close? - Closes #15277 - Part of #14914 ## Rationale for this change ## What changes are included in this PR? - Implement Unio

Re: [I] A complete solution for stable and safe sort with spill [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #14692: URL: https://github.com/apache/datafusion/issues/14692#issuecomment-2729485594 I started collecting sort larger than RAM related things here: - https://github.com/apache/datafusion/issues/15271

Re: [I] Generate the common SQL for the unparsing result of the unnest [datafusion]

2025-03-17 Thread via GitHub
blaginin commented on issue #15233: URL: https://github.com/apache/datafusion/issues/15233#issuecomment-2730414093 > I think @blaginin's work https://github.com/apache/datafusion/pull/14781 pretty matches this approach (https://github.com/apache/datafusion/pull/14781#discussion_r1991936799)

Re: [I] Analysis to support`SortPreservingMerge` --> `ProgressiveEval` [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #15191: URL: https://github.com/apache/datafusion/issues/15191#issuecomment-2730417397 > Incidentally this is the use case I am targeting. Anyway, this query would result in at least 3 partitions, two of which are overlapping in time. If we could generalize Progress

Re: [I] Analysis to support`SortPreservingMerge` --> `ProgressiveEval` [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #15191: URL: https://github.com/apache/datafusion/issues/15191#issuecomment-2730420780 > I don't mean to tout my own horn too much, but in fact this exact use case is what [FileScanConfig::split_groups_by_statistics](https://github.com/apache/datafusion/blob/main/da

Re: [I] [EPIC] A collection of tickets for improved WASM support in DataFusion [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #13815: URL: https://github.com/apache/datafusion/issues/13815#issuecomment-2730426380 > I’m very interested in contributing to the WASM support for DataFusion project as part of GSoC 2025. Enhancing embeddability and ensuring robust WASM integration aligns with my

Re: [PR] fix: Refactor CometScanRule and fix bugs [datafusion-comet]

2025-03-17 Thread via GitHub
parthchandra commented on code in PR #1483: URL: https://github.com/apache/datafusion-comet/pull/1483#discussion_r1999359044 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -188,69 +185,62 @@ class CometSparkSessionExtensions sc

[PR] chore: Enable Spark SQL tests for native_iceberg_compat [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove opened a new pull request, #1541: URL: https://github.com/apache/datafusion-comet/pull/1541 ## Which issue does this PR close? Part of https://github.com/apache/datafusion-comet/issues/1489 ## Rationale for this change ## What changes are include

Re: [I] [EPIC] A collection of tickets for improved WASM support in DataFusion [datafusion]

2025-03-17 Thread via GitHub
matthewmturner commented on issue #13815: URL: https://github.com/apache/datafusion/issues/13815#issuecomment-2730457784 > 4. Stretch goal -- prototype how we would support WASM user defined functions On this point, I was able to get WASM UDFs working in [dft](https://github.com/data

Re: [PR] chore: remove deprecated variants of UDF's invoke (invoke, invoke_no_args, invoke_batch) [datafusion]

2025-03-17 Thread via GitHub
alamb merged PR #15123: URL: https://github.com/apache/datafusion/pull/15123

Re: [I] Deprecate and eventually remove `ScalarUDF::invoke_batch` [datafusion]

2025-03-17 Thread via GitHub
alamb closed issue #14652: Deprecate and eventually remove `ScalarUDF::invoke_batch` URL: https://github.com/apache/datafusion/issues/14652

Re: [PR] chore: remove deprecated variants of UDF's invoke (invoke, invoke_no_args, invoke_batch) [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15123: URL: https://github.com/apache/datafusion/pull/15123#issuecomment-2729560622 Thanks again

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-17 Thread via GitHub
Spaarsh commented on PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#issuecomment-2729797059 @timsaucer there are still several rules to be enabled. But those require significant changes. Take a look at the rule `PLR0913`: ``` docs/source/conf.py:76:5: PLR0913 Too

Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-17 Thread via GitHub
chenkovsky commented on code in PR #15212: URL: https://github.com/apache/datafusion/pull/15212#discussion_r1999031362 ## datafusion/sql/src/unparser/expr.rs: ## @@ -94,6 +94,7 @@ impl Unparser<'_> { Ok(root_expr) } +#[cfg_attr(feature = "recursive_protection

Re: [PR] Improve feature flag CI coverage `datafusion` and `datafusion-functions` [datafusion]

2025-03-17 Thread via GitHub
alamb merged PR #15203: URL: https://github.com/apache/datafusion/pull/15203

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-17 Thread via GitHub
xudong963 commented on code in PR #15268: URL: https://github.com/apache/datafusion/pull/15268#discussion_r1999040059 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -465,45 +465,103 @@ impl FileFormat for ParquetFormat { } } -/// Coerces the file schema if th

Re: [I] Improved CI test coverage for rust features [datafusion]

2025-03-17 Thread via GitHub
alamb closed issue #15155: Improved CI test coverage for rust features URL: https://github.com/apache/datafusion/issues/15155

Re: [PR] refactor: Move view and stream from `datasource` to `catalog` [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15260: URL: https://github.com/apache/datafusion/pull/15260#discussion_r1999228102 ## datafusion/catalog/Cargo.toml: ## @@ -35,17 +35,18 @@ arrow = { workspace = true } async-trait = { workspace = true } dashmap = { workspace = true } datafusion

Re: [PR] feat: Native support utf8view for regex string operators [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15275: URL: https://github.com/apache/datafusion/pull/15275#discussion_r1999247151 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -1177,26 +1177,6 @@ pub fn string_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option

Re: [I] Cannot build a fresh checkout, configure_me has yanked deps [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm closed issue #1042: Cannot build a fresh checkout, configure_me has yanked deps URL: https://github.com/apache/datafusion-ballista/issues/1042

Re: [I] Cannot build a fresh checkout, configure_me has yanked deps [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm commented on issue #1042: URL: https://github.com/apache/datafusion-ballista/issues/1042#issuecomment-2730896362 > can we close this issue, please? it does not look as ballista problem closing this issue

Re: [PR] chore: Enable Spark SQL tests for native_datafusion [datafusion-comet]

2025-03-17 Thread via GitHub
codecov-commenter commented on PR #1543: URL: https://github.com/apache/datafusion-comet/pull/1543#issuecomment-2730917939 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1543?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Lack of images in docker hub [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove closed issue #1044: Lack of images in docker hub URL: https://github.com/apache/datafusion-ballista/issues/1044

Re: [PR] feat: publish docker containers for executor and scheduler [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1200: URL: https://github.com/apache/datafusion-ballista/pull/1200

Re: [PR] chore: update python dependencies [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1197: URL: https://github.com/apache/datafusion-ballista/pull/1197

Re: [PR] doc: update docker related documentation [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1204: URL: https://github.com/apache/datafusion-ballista/pull/1204

Re: [I] `Starting a Ballista Cluster using Docker` documentation is incorrect [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove closed issue #1198: `Starting a Ballista Cluster using Docker` documentation is incorrect URL: https://github.com/apache/datafusion-ballista/issues/1198

Re: [PR] Only unnest source for `EmptyRelation` [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15159: URL: https://github.com/apache/datafusion/pull/15159#discussion_r1999670066 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## Review Comment: I don't understand this comment -- doesn't this PR *add* a new test? ## datafusion/s

[PR] chore(deps): bump ring from 0.17.8 to 0.17.14 in /python [datafusion-ballista]

2025-03-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1206: URL: https://github.com/apache/datafusion-ballista/pull/1206 Bumps [ring](https://github.com/briansmith/ring) from 0.17.8 to 0.17.14. Changelog: sourced from [ring's changelog](https://github.com/briansmith/ring/blob/main/RELEASES.md)

[I] Spark SQL test failures in native_datafusion scan [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove opened a new issue, #1545: URL: https://github.com/apache/datafusion-comet/issues/1545 ### Describe the bug There are more than 100 failures in `core2` suite that seem to have the same root cause: ``` - Spark vectorized reader - without partition data column - sele

[PR] doc: remove arrow from doc title [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm opened a new pull request, #1207: URL: https://github.com/apache/datafusion-ballista/pull/1207 # Which issue does this PR close? Closes #1016. # Rationale for this change # What changes are included in this PR? # Are there any user-faci

Re: [PR] build: Use unique name for surefire artifacts [datafusion-comet]

2025-03-17 Thread via GitHub
codecov-commenter commented on PR #1544: URL: https://github.com/apache/datafusion-comet/pull/1544#issuecomment-2730955992 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1544?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
adriangb commented on code in PR #15284: URL: https://github.com/apache/datafusion/pull/15284#discussion_r1999690016 ## datafusion/catalog/src/session.rs: ## @@ -145,7 +145,7 @@ impl From<&dyn Session> for TaskContext { state.scalar_functions().clone(),
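For readers unfamiliar with the lint named in this thread's title, the following is a minimal sketch of what `clippy::clone_on_ref_ptr` asks for. It is not taken from the PR; the `Registry` type and the `share` function are hypothetical names used only for illustration.

```rust
use std::sync::Arc;

/// Hypothetical registry type used only for illustration; not a DataFusion type.
struct Registry {
    names: Vec<String>,
}

/// Returns another handle to the same registry.
fn share(registry: &Arc<Registry>) -> Arc<Registry> {
    // `registry.clone()` would also compile, but `clippy::clone_on_ref_ptr`
    // flags it because the call reads like a deep copy. `Arc::clone` makes it
    // explicit that only the reference count is incremented.
    Arc::clone(registry)
}

fn main() {
    let original = Arc::new(Registry {
        names: vec!["abs".to_string()],
    });
    let shared = share(&original);

    // Both handles point at the same allocation.
    assert!(Arc::ptr_eq(&original, &shared));
    assert_eq!(Arc::strong_count(&original), 2);
    println!("{} function(s) shared", shared.names.len());
}
```

The explicit form costs nothing at runtime; it only makes the cheap pointer copy visible at the call site.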

Re: [I] Example program gives `mismatched types` error on `ParquetReadOptions` [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm closed issue #1021: Example program gives `mismatched types` error on `ParquetReadOptions` URL: https://github.com/apache/datafusion-ballista/issues/1021

Re: [I] Example program gives `mismatched types` error on `ParquetReadOptions` [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm commented on issue #1021: URL: https://github.com/apache/datafusion-ballista/issues/1021#issuecomment-2730957226 I'm closing this issue, as it was caused by a DataFusion/Ballista version mismatch. The Ballista version signals which DataFusion version it supports; Ballista 43 supports

Re: [PR] chore(deps): bump ring from 0.17.11 to 0.17.13 [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm merged PR #1199: URL: https://github.com/apache/datafusion-ballista/pull/1199

[PR] WIP: Test arrow-rs 54.3.0 upgrade [datafusion]

2025-03-17 Thread via GitHub
alamb opened a new pull request, #15285: URL: https://github.com/apache/datafusion/pull/15285 ## Which issue does this PR close? - part of https://github.com/apache/arrow-rs/issues/7107 ## Rationale for this change This tests with the newest version of arrow as a

[I] Add support for S3 Object Store in default binaries [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm opened a new issue, #1205: URL: https://github.com/apache/datafusion-ballista/issues/1205 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** With #1200 publishing executor and scheduler containers, there is need to su

Re: [PR] feat(datafusion-functions-aggregate): add support for lists and other nested types in `min` and `max` [datafusion]

2025-03-17 Thread via GitHub
github-actions[bot] commented on PR #13991: URL: https://github.com/apache/datafusion/pull/13991#issuecomment-2731392221 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] Emit warning with attached `Diagnostic` when doing `= NULL` [datafusion]

2025-03-17 Thread via GitHub
changsun20 commented on issue #14434: URL: https://github.com/apache/datafusion/issues/14434#issuecomment-2731465313 Hi @eliaperantoni, Thank you for the detailed guidance! Here's my understanding of the next steps: 1. **Warning Scope** I'll implement the semantic appro

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-17 Thread via GitHub
comphead commented on PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456#issuecomment-2731468304 @andygrove @kazuyukitanimura please have a second look. Nested arrays and Iceberg compat support will be added in a follow-up PR.

[PR] Improvement/improve wildcard error 15004 [datafusion]

2025-03-17 Thread via GitHub
Jiashu-Hu opened a new pull request, #15287: URL: https://github.com/apache/datafusion/pull/15287 ## Which issue does this PR close? - Closes [#15004](https://github.com/apache/datafusion/issues/15004). ## Rationale for this change The current error messages f

[PR] add LOCK operation for ALTER TABLE [datafusion-sqlparser-rs]

2025-03-17 Thread via GitHub
MohamedAbdeen21 opened a new pull request, #1768: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1768 Closes #1665

Re: [PR] chore(deps): bump ring from 0.17.8 to 0.17.14 in /python [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm merged PR #1206: URL: https://github.com/apache/datafusion-ballista/pull/1206

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15284: URL: https://github.com/apache/datafusion/pull/15284#discussion_r1999897889 ## datafusion/catalog/src/session.rs: ## @@ -145,7 +145,7 @@ impl From<&dyn Session> for TaskContext { state.scalar_functions().clone(), st

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#issuecomment-2731258066 > > there is one test failing > > No idea why this test should have failed. Hoping that the rerun will fix it. If it was the surefire upload issue then the fix is

Re: [PR] Add LOCK operation for ALTER TABLE [datafusion-sqlparser-rs]

2025-03-17 Thread via GitHub
iffyio merged PR #1768: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1768

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-17 Thread via GitHub
xudong963 commented on code in PR #15268: URL: https://github.com/apache/datafusion/pull/15268#discussion_r2000307800 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -465,7 +465,116 @@ impl FileFormat for ParquetFormat { } } +/// Apply necessary schema type co

[PR] Use `any` instead of `for_each` [datafusion]

2025-03-17 Thread via GitHub
xudong963 opened a new pull request, #15289: URL: https://github.com/apache/datafusion/pull/15289 ## Which issue does this PR close? - Closes #. ## Rationale for this change Is the code essentially looking for the first indication that statistics are availabl
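The PR body is truncated here, so the actual change is not shown. As a general illustration of the idea in the title, the sketch below contrasts a `for_each` loop that sets a flag with `Iterator::any`, which stops at the first match. The `FileStats` type and both function names are hypothetical and are not DataFusion APIs.

```rust
/// Hypothetical stand-in for per-file statistics; not a DataFusion type.
struct FileStats {
    num_rows: Option<usize>,
}

/// `for_each` version: visits every element, even after the answer is known.
fn has_stats_for_each(files: &[FileStats]) -> bool {
    let mut found = false;
    files.iter().for_each(|f| {
        if f.num_rows.is_some() {
            found = true;
        }
    });
    found
}

/// `any` version: short-circuits at the first element matching the predicate.
fn has_stats_any(files: &[FileStats]) -> bool {
    files.iter().any(|f| f.num_rows.is_some())
}

fn main() {
    let files = vec![
        FileStats { num_rows: None },
        FileStats { num_rows: Some(10) },
        FileStats { num_rows: None },
    ];
    // Both versions agree on the result; only the amount of work differs.
    assert_eq!(has_stats_for_each(&files), has_stats_any(&files));
    println!("statistics available: {}", has_stats_any(&files));
}
```

Besides skipping unnecessary iterations, the `any` form removes the mutable flag, so the intent ("does at least one element satisfy this?") is stated directly.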

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1536: URL: https://github.com/apache/datafusion-comet/issues/1536#issuecomment-2730761195 I will close this issue since the design does seem correct. We should remove the TODO though.

Re: [I] Expose user defined functions in the FFI [datafusion]

2025-03-17 Thread via GitHub
Dev79844 commented on issue #14562: URL: https://github.com/apache/datafusion/issues/14562#issuecomment-2731335616 Hey @timsaucer I would like to contribute. Can I take up window and table?

Re: [I] Require Comet 0.6 Docker image for Spark 3.5.5 [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1509: URL: https://github.com/apache/datafusion-comet/issues/1509#issuecomment-2731240978 @RaghavendraGanesh We now have a 0.7.0 image for Spark 3.5.4. Perhaps that helps? https://hub.docker.com/r/apache/datafusion-comet/tags

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-17 Thread via GitHub
xudong963 commented on code in PR #15268: URL: https://github.com/apache/datafusion/pull/15268#discussion_r2000109655 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -465,45 +465,103 @@ impl FileFormat for ParquetFormat { } } -/// Coerces the file schema if th

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15268: URL: https://github.com/apache/datafusion/pull/15268#discussion_r2000156746 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -465,7 +465,116 @@ impl FileFormat for ParquetFormat { } } +/// Apply necessary schema type c

Re: [PR] refactor: Move view and stream from `datasource` to `catalog` [datafusion]

2025-03-17 Thread via GitHub
logan-keede commented on code in PR #15260: URL: https://github.com/apache/datafusion/pull/15260#discussion_r158674 ## datafusion/catalog/src/lib.rs: ## @@ -46,4 +46,94 @@ pub use r#async::*; pub use schema::*; pub use session::*; pub use table::*; +pub mod stream; pub m

Re: [I] Migrate the following tests to `insta` [datafusion]

2025-03-17 Thread via GitHub
jsai28 commented on issue #15282: URL: https://github.com/apache/datafusion/issues/15282#issuecomment-2730914857 take

Re: [PR] Update version to 46.0.1, add CHANGELOG (#15243) [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15244: URL: https://github.com/apache/datafusion/pull/15244#issuecomment-2730389755 🚀

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-17 Thread via GitHub
parthchandra commented on code in PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1999893696 ## native/core/src/parquet/parquet_exec.rs: ## @@ -0,0 +1,139 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-17 Thread via GitHub
parthchandra commented on PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#issuecomment-2731253073 > there is one test failing No idea why this test should have failed. Hoping that the rerun will fix it.

[PR] minor: fix `data/sqlite` link [datafusion]

2025-03-17 Thread via GitHub
sdht0 opened a new pull request, #15286: URL: https://github.com/apache/datafusion/pull/15286 Just updated the link.

Re: [I] Support duckdb's `INSERT OR IGNORE INTO ...` [datafusion]

2025-03-17 Thread via GitHub
qazxcdswe123 closed issue #14966: Support duckdb's `INSERT OR IGNORE INTO ...` URL: https://github.com/apache/datafusion/issues/14966

Re: [PR] fix: Refactor CometScanRule and fix bugs [datafusion-comet]

2025-03-17 Thread via GitHub
parthchandra commented on code in PR #1483: URL: https://github.com/apache/datafusion-comet/pull/1483#discussion_r1999846503 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -188,69 +185,62 @@ class CometSparkSessionExtensions sc

Re: [PR] minor: make `graphviz-rust` dependency optional [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1203: URL: https://github.com/apache/datafusion-ballista/pull/1203

Re: [I] `ScalarValue::try_from_array` holds onto input array for Struct, Lists types [datafusion]

2025-03-17 Thread via GitHub
chenkovsky commented on issue #15205: URL: https://github.com/apache/datafusion/issues/15205#issuecomment-2731509381 Hi, @Dandandan . it seems that there's no direct way to deep copy array in arrow? so I tested the following code. it seems that it works. ```rust let a = a

Re: [PR] minor: fix `data/sqlite` link [datafusion]

2025-03-17 Thread via GitHub
Weijun-H commented on code in PR #15286: URL: https://github.com/apache/datafusion/pull/15286#discussion_r2000123298 ## datafusion/sqllogictest/README.md: ## @@ -28,7 +28,7 @@ This crate is a submodule of DataFusion that contains an implementation of [sqll ## Overview This

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r110359 ## datafusion/expr/src/expr.rs: ## @@ -2607,11 +2793,23 @@ pub(crate) fn schema_name_from_exprs_comma_separated_without_space( schema_name_from_exprs_inne

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on PR #15253: URL: https://github.com/apache/datafusion/pull/15253#issuecomment-2731279083 Is it possible to modify `Display` for Expr for explain statement?

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r104973 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2612,176 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15284: URL: https://github.com/apache/datafusion/pull/15284#issuecomment-2731258754 > lgtm thanks @alamb not sure why CI failed though Thanks @comphead -- I am not sure either -- I'll figure it out

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
irenjj commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r2000179936 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2612,176 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for SqlD

Re: [I] remove `graphviz-rust` dependency [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove closed issue #1202: remove `graphviz-rust` dependency URL: https://github.com/apache/datafusion-ballista/issues/1202 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

Re: [PR] Support logic optimize rule to pass the case that Utf8view datatype combined with Utf8 datatype [datafusion]

2025-03-17 Thread via GitHub
zhuqi-lucas commented on code in PR #15239: URL: https://github.com/apache/datafusion/pull/15239#discussion_r2000189083 ## datafusion/common/src/dfschema.rs: ## @@ -605,25 +582,49 @@ impl DFSchema { } /// Returns true if the two schemas have the same qualified named

Re: [PR] Support logic optimize rule to pass the case that Utf8view datatype combined with Utf8 datatype [datafusion]

2025-03-17 Thread via GitHub
zhuqi-lucas commented on PR #15239: URL: https://github.com/apache/datafusion/pull/15239#issuecomment-2731620314 > Thanks @zhuqi-lucas -- this code looks really nice 👨‍🍳 👌 > > I think we should avoid API breakages but otherwise looks great to me Thank you for review @alamb , add

Re: [I] Build failure in flight_sql.rs [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm commented on issue #895: URL: https://github.com/apache/datafusion-ballista/issues/895#issuecomment-2731017743 It looks like this issue has been fixed, is it ok to close this issue? -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
irenjj commented on PR #15253: URL: https://github.com/apache/datafusion/pull/15253#issuecomment-2731239713 > Thank you @irenjj -- this looks really nice > > I think there are a few ways we should improve the comments, ~but I'll directly push those to this branch.~ > > Thanks a

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove closed issue #1510: Publish official Docker images to Docker Hub under Apache account URL: https://github.com/apache/datafusion-comet/issues/1510 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1510: URL: https://github.com/apache/datafusion-comet/issues/1510#issuecomment-2731240199 Closing as complete: https://hub.docker.com/r/apache/datafusion-comet/tags -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] feat: implement scripts for binary release build [datafusion-comet]

2025-03-17 Thread via GitHub
dpengpeng commented on PR #932: URL: https://github.com/apache/datafusion-comet/pull/932#issuecomment-2731538887 @parthchandra A question: In the current compilation script `dev/release/build-release-comet.sh`, the final invocation of the compilation command is `core-amd64-libs`

Re: [PR] Add substrait tpch round trip tests from sql query [datafusion]

2025-03-17 Thread via GitHub
github-actions[bot] commented on PR #13888: URL: https://github.com/apache/datafusion/pull/13888#issuecomment-2731392253 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r107560 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2612,176 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for

Re: [PR] refactor: Move view and stream from `datasource` to `catalog` [datafusion]

2025-03-17 Thread via GitHub
logan-keede commented on code in PR #15260: URL: https://github.com/apache/datafusion/pull/15260#discussion_r159689 ## datafusion/catalog/Cargo.toml: ## @@ -35,17 +35,18 @@ arrow = { workspace = true } async-trait = { workspace = true } dashmap = { workspace = true } data

Re: [PR] Only unnest source for `EmptyRelation` [datafusion]

2025-03-17 Thread via GitHub
blaginin commented on code in PR #15159: URL: https://github.com/apache/datafusion/pull/15159#discussion_r1998600989 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## Review Comment: Had to remove this because now (even in main) it produces `SELECT * FROM (SELECT [1, 2, 3]

Re: [PR] fix: add an "expr_planners" method to SessionState [datafusion]

2025-03-17 Thread via GitHub
niebayes commented on code in PR #15119: URL: https://github.com/apache/datafusion/pull/15119#discussion_r1998497878 ## datafusion/core/src/execution/context/mod.rs: ## @@ -1632,7 +1632,7 @@ impl FunctionRegistry for SessionContext { } fn expr_planners(&self) -> Vec>

Re: [PR] fix: add an "expr_planners" method to SessionState [datafusion]

2025-03-17 Thread via GitHub
niebayes commented on PR #15119: URL: https://github.com/apache/datafusion/pull/15119#issuecomment-2729093006 @alamb Hi, I've written a test to demonstrate why it's more convenient and somewhat necessary to provide an `expr_planners` method for `SessionState`. I'm not sure if this tes

Re: [D] Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? [datafusion]

2025-03-17 Thread via GitHub
GitHub user alamb added a comment to the discussion: Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? Actually, here is a good example : https://github.com/apache/datafusion/blob/87eec43856a5d8cefef24d1ff85d375d2b58d8c2/datafusion/core/tests/user_defined/expr

[PR] Improve speed of `first_value` by implementing special `GroupsAccumulator` [datafusion]

2025-03-17 Thread via GitHub
UBarney opened a new pull request, #15266: URL: https://github.com/apache/datafusion/pull/15266 ## Which issue does this PR close? part of #13998 ## Rationale for this change Achieved a 5x performance boost. Benchmark sql: `select id2, id4, first_value(v1 order

[I] Add documentation about how to plan custom expressions [datafusion]

2025-03-17 Thread via GitHub
alamb opened a new issue, #15267: URL: https://github.com/apache/datafusion/issues/15267 ### Is your feature request related to a problem or challenge? By default DataFusion doesn't support many operators, for example `->` ```sql > set datafusion.sql_parser.dialect = postgres
