Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-17 Thread via GitHub
xudong963 commented on code in PR #15268: URL: https://github.com/apache/datafusion/pull/15268#discussion_r2000307800 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -465,7 +465,116 @@ impl FileFormat for ParquetFormat { } } +/// Apply necessary schema type co

[PR] Use `any` instead of `for_each` [datafusion]

2025-03-17 Thread via GitHub
xudong963 opened a new pull request, #15289: URL: https://github.com/apache/datafusion/pull/15289 ## Which issue does this PR close? - Closes #. ## Rationale for this change Is the code essentially looking for the first indication that statistics are availabl

Re: [PR] Add LOCK operation for ALTER TABLE [datafusion-sqlparser-rs]

2025-03-17 Thread via GitHub
iffyio merged PR #1768: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1768 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

Re: [PR] Add substrait tpch round trip tests from sql query [datafusion]

2025-03-17 Thread via GitHub
github-actions[bot] commented on PR #13888: URL: https://github.com/apache/datafusion/pull/13888#issuecomment-2731392253 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] feat: implement scripts for binary release build [datafusion-comet]

2025-03-17 Thread via GitHub
dpengpeng commented on PR #932: URL: https://github.com/apache/datafusion-comet/pull/932#issuecomment-2731538887 @parthchandra I have a question: in the current compilation script `dev/release/build-release-comet.sh`, the final invocation of the compilation command is `core-amd64-libs`

Re: [PR] Update version to 46.0.1, add CHANGELOG (#15243) [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15244: URL: https://github.com/apache/datafusion/pull/15244#issuecomment-2730389755 🚀

Re: [PR] Support logic optimize rule to pass the case that Utf8view datatype combined with Utf8 datatype [datafusion]

2025-03-17 Thread via GitHub
zhuqi-lucas commented on PR #15239: URL: https://github.com/apache/datafusion/pull/15239#issuecomment-2731620314 > Thanks @zhuqi-lucas -- this code looks really nice 👨‍🍳 👌 > > I think we should avoid API breakages but otherwise looks great to me Thank you for review @alamb , add

Re: [PR] Support logic optimize rule to pass the case that Utf8view datatype combined with Utf8 datatype [datafusion]

2025-03-17 Thread via GitHub
zhuqi-lucas commented on code in PR #15239: URL: https://github.com/apache/datafusion/pull/15239#discussion_r2000189083 ## datafusion/common/src/dfschema.rs: ## @@ -605,25 +582,49 @@ impl DFSchema { } /// Returns true if the two schemas have the same qualified named

Re: [I] remove `graphviz-rust` dependency [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove closed issue #1202: remove `graphviz-rust` dependency URL: https://github.com/apache/datafusion-ballista/issues/1202

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
irenjj commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r2000179936 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2612,176 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for SqlD

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15268: URL: https://github.com/apache/datafusion/pull/15268#discussion_r2000156746 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -465,7 +465,116 @@ impl FileFormat for ParquetFormat { } } +/// Apply necessary schema type c

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-17 Thread via GitHub
xudong963 commented on code in PR #15268: URL: https://github.com/apache/datafusion/pull/15268#discussion_r2000109655 ## datafusion/datasource-parquet/src/file_format.rs: ## @@ -465,45 +465,103 @@ impl FileFormat for ParquetFormat { } } -/// Coerces the file schema if th

Re: [PR] minor: fix `data/sqlite` link [datafusion]

2025-03-17 Thread via GitHub
Weijun-H commented on code in PR #15286: URL: https://github.com/apache/datafusion/pull/15286#discussion_r2000123298 ## datafusion/sqllogictest/README.md: ## @@ -28,7 +28,7 @@ This crate is a submodule of DataFusion that contains an implementation of [sqll ## Overview This

Re: [I] `ScalarValue::try_from_array` holds onto input array for Struct, Lists types [datafusion]

2025-03-17 Thread via GitHub
chenkovsky commented on issue #15205: URL: https://github.com/apache/datafusion/issues/15205#issuecomment-2731509381 Hi, @Dandandan . it seems that there's no direct way to deep copy array in arrow? so I tested the following code. it seems that it works. ```rust let a = a

[PR] Improvement/improve wildcard error 15004 [datafusion]

2025-03-17 Thread via GitHub
Jiashu-Hu opened a new pull request, #15287: URL: https://github.com/apache/datafusion/pull/15287 ## Which issue does this PR close? - Closes [#15004](https://github.com/apache/datafusion/issues/15004). ## Rationale for this change The current error messages f

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-17 Thread via GitHub
comphead commented on PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456#issuecomment-2731468304 @andygrove @kazuyukitanimura please have a second look Nested arrays and Iceberg compat support will be added in follow up PR

Re: [I] Emit warning with attached `Diagnostic` when doing `= NULL` [datafusion]

2025-03-17 Thread via GitHub
changsun20 commented on issue #14434: URL: https://github.com/apache/datafusion/issues/14434#issuecomment-2731465313 Hi @eliaperantoni, Thank you for the detailed guidance! Here's my understanding of the next steps: 1. **Warning Scope** I'll implement the semantic appro

Re: [I] Support duckdb's `INSERT OR IGNORE INTO ...` [datafusion]

2025-03-17 Thread via GitHub
qazxcdswe123 closed issue #14966: Support duckdb's `INSERT OR IGNORE INTO ...` URL: https://github.com/apache/datafusion/issues/14966

[PR] minor: fix `data/sqlite` link [datafusion]

2025-03-17 Thread via GitHub
sdht0 opened a new pull request, #15286: URL: https://github.com/apache/datafusion/pull/15286 Just updated the link.

Re: [PR] feat(datafusion-functions-aggregate): add support for lists and other nested types in `min` and `max` [datafusion]

2025-03-17 Thread via GitHub
github-actions[bot] commented on PR #13991: URL: https://github.com/apache/datafusion/pull/13991#issuecomment-2731392221 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] Migrate the following tests to `insta` [datafusion]

2025-03-17 Thread via GitHub
jsai28 commented on issue #15282: URL: https://github.com/apache/datafusion/issues/15282#issuecomment-2730914857 take

Re: [PR] refactor: Move view and stream from `datasource` to `catalog` [datafusion]

2025-03-17 Thread via GitHub
logan-keede commented on code in PR #15260: URL: https://github.com/apache/datafusion/pull/15260#discussion_r159689 ## datafusion/catalog/Cargo.toml: ## @@ -35,17 +35,18 @@ arrow = { workspace = true } async-trait = { workspace = true } dashmap = { workspace = true } data

Re: [I] Require Comet 0.6 Docker image for Spark 3.5.5 [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1509: URL: https://github.com/apache/datafusion-comet/issues/1509#issuecomment-2731240978 @RaghavendraGanesh We now have a 0.7.0 image for Spark 3.5.4. Perhaps that helps? https://hub.docker.com/r/apache/datafusion-comet/tags

Re: [I] Expose user defined functions in the FFI [datafusion]

2025-03-17 Thread via GitHub
Dev79844 commented on issue #14562: URL: https://github.com/apache/datafusion/issues/14562#issuecomment-2731335616 Hey @timsaucer I would like to contribute. Can I take up window and table?

Re: [PR] refactor: Move view and stream from `datasource` to `catalog` [datafusion]

2025-03-17 Thread via GitHub
logan-keede commented on code in PR #15260: URL: https://github.com/apache/datafusion/pull/15260#discussion_r158674 ## datafusion/catalog/src/lib.rs: ## @@ -46,4 +46,94 @@ pub use r#async::*; pub use schema::*; pub use session::*; pub use table::*; +pub mod stream; pub m

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r104973 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2612,176 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15284: URL: https://github.com/apache/datafusion/pull/15284#issuecomment-2731258754 > lgtm thanks @alamb not sure why CI failed though Thanks @comphead -- I am not sure either -- I'll figure it out

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on PR #15253: URL: https://github.com/apache/datafusion/pull/15253#issuecomment-2731279083 Is it possible to modify `Display` for Expr for explain statement?

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r110359 ## datafusion/expr/src/expr.rs: ## @@ -2607,11 +2793,23 @@ pub(crate) fn schema_name_from_exprs_comma_separated_without_space( schema_name_from_exprs_inne

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
jayzhan211 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r107560 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2612,176 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#issuecomment-2731258066 > > there is one test failing > > No idea why this test should have failed. Hoping that the rerun will fix it. If it was the surefire upload issue then the fix is

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15284: URL: https://github.com/apache/datafusion/pull/15284#discussion_r1999897889 ## datafusion/catalog/src/session.rs: ## @@ -145,7 +145,7 @@ impl From<&dyn Session> for TaskContext { state.scalar_functions().clone(), st

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-17 Thread via GitHub
parthchandra commented on PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#issuecomment-2731253073 > there is one test failing No idea why this test should have failed. Hoping that the rerun will fix it.

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-17 Thread via GitHub
parthchandra commented on code in PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1999893696 ## native/core/src/parquet/parquet_exec.rs: ## @@ -0,0 +1,139 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove closed issue #1510: Publish official Docker images to Docker Hub under Apache account URL: https://github.com/apache/datafusion-comet/issues/1510

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1510: URL: https://github.com/apache/datafusion-comet/issues/1510#issuecomment-2731240199 Closing as complete: https://hub.docker.com/r/apache/datafusion-comet/tags

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
irenjj commented on PR #15253: URL: https://github.com/apache/datafusion/pull/15253#issuecomment-2731239713 > Thank you @irenjj -- this looks really nice > > I think there are a few ways we should improve the comments, ~but I'll directly push those to this branch.~ > > Thanks a

Re: [PR] minor: make `graphviz-rust` dependency optional [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1203: URL: https://github.com/apache/datafusion-ballista/pull/1203

Re: [PR] fix: Refactor CometScanRule and fix bugs [datafusion-comet]

2025-03-17 Thread via GitHub
parthchandra commented on code in PR #1483: URL: https://github.com/apache/datafusion-comet/pull/1483#discussion_r1999846503 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -188,69 +185,62 @@ class CometSparkSessionExtensions sc

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1536: URL: https://github.com/apache/datafusion-comet/issues/1536#issuecomment-2730761195 I will close this issue since the design does seem correct. We should remove the TODO though.

Re: [I] Build failure in flight_sql.rs [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm commented on issue #895: URL: https://github.com/apache/datafusion-ballista/issues/895#issuecomment-2731017743 It looks like this issue has been fixed, is it ok to close this issue?

Re: [PR] chore(deps): bump ring from 0.17.8 to 0.17.14 in /python [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm merged PR #1206: URL: https://github.com/apache/datafusion-ballista/pull/1206

[PR] add LOCK operation for ALTER TABLE [datafusion-sqlparser-rs]

2025-03-17 Thread via GitHub
MohamedAbdeen21 opened a new pull request, #1768: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1768 Closes #1665

Re: [I] Example program gives `mismatched types` error on `ParquetReadOptions` [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm commented on issue #1021: URL: https://github.com/apache/datafusion-ballista/issues/1021#issuecomment-2730957226 I'm closing this issue as it was caused by a datafusion/ballista version mismatch. Ballista version signals which datafusion it supports, ballista 43 supports

Re: [I] Example program gives `mismatched types` error on `ParquetReadOptions` [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm closed issue #1021: Example program gives `mismatched types` error on `ParquetReadOptions` URL: https://github.com/apache/datafusion-ballista/issues/1021

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
adriangb commented on code in PR #15284: URL: https://github.com/apache/datafusion/pull/15284#discussion_r1999690016 ## datafusion/catalog/src/session.rs: ## @@ -145,7 +145,7 @@ impl From<&dyn Session> for TaskContext { state.scalar_functions().clone(),

Re: [PR] build: Use unique name for surefire artifacts [datafusion-comet]

2025-03-17 Thread via GitHub
codecov-commenter commented on PR #1544: URL: https://github.com/apache/datafusion-comet/pull/1544#issuecomment-2730955992 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1544?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

[PR] doc: remove arrow from doc title [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm opened a new pull request, #1207: URL: https://github.com/apache/datafusion-ballista/pull/1207 # Which issue does this PR close? Closes #1016. # Rationale for this change # What changes are included in this PR? # Are there any user-faci

[I] Spark SQL test failures in native_datafusion scan [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove opened a new issue, #1545: URL: https://github.com/apache/datafusion-comet/issues/1545 ### Describe the bug There are more than 100 failures in `core2` suite that seem to have the same root cause: ``` - Spark vectorized reader - without partition data column - sele

[PR] chore(deps): bump ring from 0.17.8 to 0.17.14 in /python [datafusion-ballista]

2025-03-17 Thread via GitHub
dependabot[bot] opened a new pull request, #1206: URL: https://github.com/apache/datafusion-ballista/pull/1206 Bumps [ring](https://github.com/briansmith/ring) from 0.17.8 to 0.17.14. Changelog Sourced from [ring's changelog](https://github.com/briansmith/ring/blob/main/RELEASES.md)

[I] Add support for S3 Object Store in default binaries [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm opened a new issue, #1205: URL: https://github.com/apache/datafusion-ballista/issues/1205 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** With #1200 publishing executor and scheduler containers, there is need to su

Re: [PR] Only unnest source for `EmptyRelation` [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15159: URL: https://github.com/apache/datafusion/pull/15159#discussion_r1999670066 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## Review Comment: I don't understand this comment -- doesn't this PR *add* a new test? ## datafusion/s

Re: [PR] doc: update docker related documentation [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1204: URL: https://github.com/apache/datafusion-ballista/pull/1204

Re: [I] `Starting a Ballista Cluster using Docker` documentation is incorrect [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove closed issue #1198: `Starting a Ballista Cluster using Docker` documentation is incorrect URL: https://github.com/apache/datafusion-ballista/issues/1198

Re: [PR] chore: update python dependencies [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1197: URL: https://github.com/apache/datafusion-ballista/pull/1197

Re: [I] Lack of images in docker hub [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove closed issue #1044: Lack of images in docker hub URL: https://github.com/apache/datafusion-ballista/issues/1044

Re: [PR] feat: publish docker containers for executor and scheduler [datafusion-ballista]

2025-03-17 Thread via GitHub
andygrove merged PR #1200: URL: https://github.com/apache/datafusion-ballista/pull/1200

Re: [PR] chore: Enable Spark SQL tests for native_datafusion [datafusion-comet]

2025-03-17 Thread via GitHub
codecov-commenter commented on PR #1543: URL: https://github.com/apache/datafusion-comet/pull/1543#issuecomment-2730917939 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1543?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

[PR] WIP: Test arrow-rs 54.3.0 upgrade [datafusion]

2025-03-17 Thread via GitHub
alamb opened a new pull request, #15285: URL: https://github.com/apache/datafusion/pull/15285 ## Which issue does this PR close? - part of https://github.com/apache/arrow-rs/issues/7107 ## Rationale for this change This tests with the newest version of arrow as a

Re: [I] Cannot build a fresh checkout, configure_me has yanked deps [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm closed issue #1042: Cannot build a fresh checkout, configure_me has yanked deps URL: https://github.com/apache/datafusion-ballista/issues/1042

Re: [I] Cannot build a fresh checkout, configure_me has yanked deps [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm commented on issue #1042: URL: https://github.com/apache/datafusion-ballista/issues/1042#issuecomment-2730896362 > can we close this issue, please? it does not look as ballista problem closing this issue

Re: [PR] chore(deps): bump ring from 0.17.11 to 0.17.13 [datafusion-ballista]

2025-03-17 Thread via GitHub
milenkovicm merged PR #1199: URL: https://github.com/apache/datafusion-ballista/pull/1199

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15284: URL: https://github.com/apache/datafusion/pull/15284#discussion_r1999606913 ## datafusion/catalog/src/session.rs: ## @@ -145,7 +145,7 @@ impl From<&dyn Session> for TaskContext { state.scalar_functions().clone(), st

Re: [I] Add SQL examples to window functions: `nth_value`, etc [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #13399: URL: https://github.com/apache/datafusion/issues/13399#issuecomment-2730849740 > [@sageraven1](https://github.com/sageraven1) , Are you still working on this? I see that PR was marked as stale and got closed. If you aren't working on this, I would like to pi

Re: [I] Run / test Datafusion with JSON Bench from ClickHouse [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #14874: URL: https://github.com/apache/datafusion/issues/14874#issuecomment-2730836455 > Add specialized support in arrow-rs for variant binary types (specifically for the metadata columns) I think this will be a fun project for the right type of person. I

Re: [PR] Migrate user_defined tests to insta [datafusion]

2025-03-17 Thread via GitHub
shruti2522 commented on PR #15255: URL: https://github.com/apache/datafusion/pull/15255#issuecomment-2730590045 > > I tried allow_duplicates!(), but it gets a bit tricky with async functions > > Can you please explain more on this? I tried modifying _async_ `run_and_compare_query` an

Re: [I] [Epic] Add snapshot tests (migrate to `insta` for tests) [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #15178: URL: https://github.com/apache/datafusion/issues/15178#issuecomment-2730830528 > [@alamb](https://github.com/alamb) can I ask you to put "good first issue" on tickets in the list if you're happy with them? I don't think I have permission to do that Do

[PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-17 Thread via GitHub
alamb opened a new pull request, #15284: URL: https://github.com/apache/datafusion/pull/15284 ## Which issue does this PR close? ## Rationale for this change - Found while reviewing https://github.com/apache/datafusion/pull/15263 from @adriangb Some of the newer Dat

Re: [PR] Fix predicate pushdown for custom SchemaAdapters [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15263: URL: https://github.com/apache/datafusion/pull/15263#discussion_r1999523980 ## datafusion/core/src/datasource/physical_plan/parquet.rs: ## @@ -224,6 +224,64 @@ mod tests { ) } +#[tokio::test] +async fn test_pushdown_w

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove closed issue #1536: `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set URL: https://github.com/apache/datafusion-comet/issues/1536

Re: [I] Analysis to support`SortPreservingMerge` --> `ProgressiveEval` [datafusion]

2025-03-17 Thread via GitHub
suremarc commented on issue #15191: URL: https://github.com/apache/datafusion/issues/15191#issuecomment-2730732469 > I think it uses [FileGroupPartitioner](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/struct.FileGroupPartitioner.html) that maintains the same orderin

[PR] build: Use unique name for surefire artifacts [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove opened a new pull request, #1544: URL: https://github.com/apache/datafusion-comet/pull/1544 ## Which issue does this PR close? N/A ## Rationale for this change I have recently seen some build failures due to: ``` Error: Failed to Create

Re: [I] Remove the need for registering an ObjectStore for remote files [datafusion-python]

2025-03-17 Thread via GitHub
kylebarron commented on issue #899: URL: https://github.com/apache/datafusion-python/issues/899#issuecomment-2730730310 I published `pyo3-object_store` 0.1, which works with pyo3 0.23, and `pyo3-object_store` 0.2, which works with pyo3 0.24. But both of these require `object_store` 0.12, s

Re: [PR] Add CatalogProvider and SchemaProvider to FFI Crate [datafusion]

2025-03-17 Thread via GitHub
timsaucer commented on PR #15280: URL: https://github.com/apache/datafusion/pull/15280#issuecomment-2730732221 Oh, good point. Added in latest push.

Re: [PR] Implement tree explain for UnionExec [datafusion]

2025-03-17 Thread via GitHub
alamb merged PR #15278: URL: https://github.com/apache/datafusion/pull/15278

[PR] chore: Enable Spark SQL tests for native_datafusion [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove opened a new pull request, #1543: URL: https://github.com/apache/datafusion-comet/pull/1543 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [I] Decorrelate scalar subqueries with more complex filter expressions [datafusion]

2025-03-17 Thread via GitHub
duongcongtoai commented on issue #14554: URL: https://github.com/apache/datafusion/issues/14554#issuecomment-2730673982 From this [PR](https://github.com/apache/datafusion/pull/6457), there are several types of query mentioned that need support 1. In Subquery contains limit/order by ``

Re: [I] Spark SQL test failures in native_iceberg_compat mode [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1542: URL: https://github.com/apache/datafusion-comet/issues/1542#issuecomment-2730625465 @parthchandra @mbutrovich fyi

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-17 Thread via GitHub
mbutrovich commented on code in PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1999472634 ## native/core/src/parquet/mod.rs: ## @@ -46,23 +47,22 @@ use self::util::jni::TypePromotionInfo; use crate::execution::operators::ExecutionError; use cra

[I] Spark SQL test failures in native_iceberg_compat mode [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove opened a new issue, #1542: URL: https://github.com/apache/datafusion-comet/issues/1542 ### Describe the bug This issue is to track Spark SQL test failures in native_iceberg_compat mode. - Comet tries to read JSON files with Parquet reader ### Steps to reprod

Re: [PR] chore: Enable Spark SQL tests for native_iceberg_compat [datafusion-comet]

2025-03-17 Thread via GitHub
codecov-commenter commented on PR #1541: URL: https://github.com/apache/datafusion-comet/pull/1541#issuecomment-2730586578 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1541?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Migrate the following tests to `insta` [datafusion]

2025-03-17 Thread via GitHub
blaginin commented on issue #15282: URL: https://github.com/apache/datafusion/issues/15282#issuecomment-2730556871 Thanks, added to the list 🌻

Re: [I] Enable parquet filter pushdown by default [datafusion]

2025-03-17 Thread via GitHub
adriangb commented on issue #3463: URL: https://github.com/apache/datafusion/issues/3463#issuecomment-2730550280 I don't think this needs to block but I'll point out that I have a PR up for a bug from the interaction between `SchemaAdapter` and parquet filter pushdown: https://github.com/ap

Re: [PR] fix: Refactor CometScanRule and fix bugs [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on code in PR #1483: URL: https://github.com/apache/datafusion-comet/pull/1483#discussion_r1999438466 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -188,69 +185,62 @@ class CometSparkSessionExtensions scanE

Re: [PR] docs: Add changelog for 0.7.0 release [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove merged PR #1527: URL: https://github.com/apache/datafusion-comet/pull/1527

Re: [PR] datafusion-cli: add streaming state struct [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15234: URL: https://github.com/apache/datafusion/pull/15234#issuecomment-2730540542 > I only had time to take a quick glance - but could this functionality be added to datafusion so it could be used by other apps that have CLIs built on datafusion? Seems like a

Re: [PR] chore: [FOLLOWUP] Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-17 Thread via GitHub
kazuyukitanimura commented on PR #1534: URL: https://github.com/apache/datafusion-comet/pull/1534#issuecomment-2730537142 Thanks @andygrove

Re: [PR] Add CatalogProvider and SchemaProvider to FFI Crate [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15280: URL: https://github.com/apache/datafusion/pull/15280#issuecomment-2730531904 Thank you @timsaucer

Re: [PR] Migrate dataframe tests to `insta` [datafusion]

2025-03-17 Thread via GitHub
alamb merged PR #15262: URL: https://github.com/apache/datafusion/pull/15262

Re: [I] [Epic] A collection of FFI related tasks [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #15283: URL: https://github.com/apache/datafusion/issues/15283#issuecomment-2730526009 FYI @timsaucer in case you have other items you want to add here

[I] [Epic] A collection of FFI related tasks [datafusion]

2025-03-17 Thread via GitHub
alamb opened a new issue, #15283: URL: https://github.com/apache/datafusion/issues/15283 ### Is your feature request related to a problem or challenge? We are adding FFI bindings to DataFusion (see https://crates.io/crates/datafusion-ffi) mostly for API stability (e.g. so python wrap

Re: [I] March 17, 2025: This week(s) in DataFusion [datafusion]

2025-03-17 Thread via GitHub
alamb commented on issue #15269: URL: https://github.com/apache/datafusion/issues/15269#issuecomment-2730516080 Oh, and of course @timsaucer is cranking out FFI bindings like - https://github.com/apache/datafusion/pull/15280

Re: [I] Migrate dataframe tests to `insta` [datafusion]

2025-03-17 Thread via GitHub
alamb closed issue #15245: Migrate dataframe tests to `insta` URL: https://github.com/apache/datafusion/issues/15245

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-17 Thread via GitHub
alamb commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r1999392980 ## datafusion/expr/src/expr.rs: ## @@ -2607,11 +2793,23 @@ pub(crate) fn schema_name_from_exprs_comma_separated_without_space( schema_name_from_exprs_inner(exp

Re: [PR] fix: Refactor CometScanRule and fix bugs [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on code in PR #1483: URL: https://github.com/apache/datafusion-comet/pull/1483#discussion_r1999378679 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -188,69 +185,62 @@ class CometSparkSessionExtensions scanE

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove commented on issue #1510: URL: https://github.com/apache/datafusion-comet/issues/1510#issuecomment-2730485723 We can run `docker scout` locally. ``` docker scout cves apache/datafusion-comet:0.5.0-spark3.4.3-scala2.12-java11 ``` There are some CVEs in dependenci

Re: [PR] chore: remove deprecated variants of UDF's invoke (invoke, invoke_no_args, invoke_batch) [datafusion]

2025-03-17 Thread via GitHub
alamb commented on PR #15123: URL: https://github.com/apache/datafusion/pull/15123#issuecomment-2729560442 🔨 let's go

Re: [PR] chore: [FOLLOWUP] Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove merged PR #1534: URL: https://github.com/apache/datafusion-comet/pull/1534

Re: [I] [EPIC] A collection of tickets for improved WASM support in DataFusion [datafusion]

2025-03-17 Thread via GitHub
matthewmturner commented on issue #13815: URL: https://github.com/apache/datafusion/issues/13815#issuecomment-2730457784 > 4. Stretch goal -- prototype how we would support WASM user defined functions On this point, I was able to get WASM UDFs working in [dft](https://github.com/data

[PR] chore: Enable Spark SQL tests for native_iceberg_compat [datafusion-comet]

2025-03-17 Thread via GitHub
andygrove opened a new pull request, #1541: URL: https://github.com/apache/datafusion-comet/pull/1541 ## Which issue does this PR close? Part of https://github.com/apache/datafusion-comet/issues/1489 ## Rationale for this change ## What changes are include
