Re: [I] Change mapping of SQL `VARCHAR` from `Utf8` to `Utf8View` [datafusion]

2025-03-15 Thread via GitHub
zhuqi-lucas commented on issue #15096: URL: https://github.com/apache/datafusion/issues/15096#issuecomment-2727226225 New sub-task: - [ ] Support Utf8View datatype for range queries -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] Support logic optimize rule to pass the case that Utf8view datatype combined with Utf8 datatype [datafusion]

2025-03-15 Thread via GitHub
zhuqi-lucas commented on PR #15239: URL: https://github.com/apache/datafusion/pull/15239#issuecomment-2727177209 > Thanks @zhuqi-lucas -- I have one request otherwise I think this looks good to me Thank you @alamb for the review, I changed back the original call to equivalent_names_and_ty

Re: [I] (RESPECT NULLS / IGNORE NULLS is syntax for window functions, not aggregate functions [datafusion]

2025-03-15 Thread via GitHub
Garamda commented on issue #15006: URL: https://github.com/apache/datafusion/issues/15006#issuecomment-2727165026 FYI: https://github.com/apache/datafusion/pull/15014#issuecomment-2725690181

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on issue #1536: URL: https://github.com/apache/datafusion-comet/issues/1536#issuecomment-2727044778 Thanks @viirya. Yes, fully native scan is only used with native execution. Maybe the design is correct as it is.

Re: [I] Expose to `GroupsAccumulator` whether all the groups are sorted [datafusion]

2025-03-15 Thread via GitHub
akurmustafa commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2727036019 > [@akurmustafa](https://github.com/akurmustafa) we can add it to the post now. Or we could always make another post "Using `WITH ORDER` with DataFusion..." > > I

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
Spaarsh commented on PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727025175 The errors are due to `SIM102`, one of the rules I enabled in one of my commits. Since I have some 10 other ruff rules to enable anyway, should I disable this one for now?

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727021187 I’m not sure then. I’m away for the weekend and won’t be able to test until Monday. I can see then if I can reproduce.

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
Spaarsh commented on PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727019630 The workflow uses `0.9.1`, and I installed the same version with `pip install`. I still don't see any errors in my local tests.

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727017583 You may be using a different version of ruff. These do change from time to time as new lints get added in.

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
Spaarsh commented on PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#issuecomment-2727017123 I don't understand why the ruff tests are failing. My local ruff check shows no errors.

Re: [PR] Reanimate Code Coverage [datafusion]

2025-03-15 Thread via GitHub
blaginin commented on PR #15256: URL: https://github.com/apache/datafusion/pull/15256#issuecomment-2727009428 To make this work, we'll need to [add](https://docs.codecov.com/docs/adding-the-codecov-token) `CODECOV_TOKEN` and add the [app](https://github.com/apps/codecov) to the repo (may al

Re: [I] Improved CI test coverage for rust features [datafusion]

2025-03-15 Thread via GitHub
blaginin commented on issue #15155: URL: https://github.com/apache/datafusion/issues/15155#issuecomment-2727008592 WIP PoC is [here](https://github.com/apache/datafusion/pull/15256). The way I think this should work is here: https://github.com/blaginin/datafusion/pull/4#issuecomment-2726555

[PR] Reanimate Code Coverage [datafusion]

2025-03-15 Thread via GitHub
blaginin opened a new pull request, #15256: URL: https://github.com/apache/datafusion/pull/15256 ## Which issue does this PR close? Related to https://github.com/apache/datafusion/issues/15155#issuecomment-2715406304 ## Rationale for this change We have quite a lot of co

[PR] Migrate user_defined tests to insta [datafusion]

2025-03-15 Thread via GitHub
shruti2522 opened a new pull request, #15255: URL: https://github.com/apache/datafusion/pull/15255 ## Which issue does this PR close? - Closes #15247 . ## Rationale for this change ## What changes are included in this PR? migrate tests in `datafusion/co

Re: [I] Different unnests on `plan_to_sql` are merged [datafusion]

2025-03-15 Thread via GitHub
blaginin commented on issue #15128: URL: https://github.com/apache/datafusion/issues/15128#issuecomment-2727002296 I also don't see other cases... They are possible in theory, and I guess by the time we get to them, we'll need to refactor the unparser quite thoroughly 🥲

Re: [PR] Only unnest source for `EmptyRelation` [datafusion]

2025-03-15 Thread via GitHub
blaginin commented on code in PR #15159: URL: https://github.com/apache/datafusion/pull/15159#discussion_r1997328648 ## datafusion/sql/src/unparser/plan.rs: ## @@ -377,8 +377,17 @@ impl Unparser<'_> { }; if self.dialect.unnest_as_table_factor()

Re: [I] [DISCUSSION] physical-plan-common crate ~and Revert the datasource - physical-plan Dependency~ [datafusion]

2025-03-15 Thread via GitHub
berkaysynnada commented on issue #15111: URL: https://github.com/apache/datafusion/issues/15111#issuecomment-2726959050 > physical-plan is not possible to import `datasource`, you would end up moving everything inside physical-plan. Why are you thinking so? If you worry about the cata

Re: [PR] build(deps): bump async-trait from 0.1.86 to 0.1.87 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] commented on PR #1046: URL: https://github.com/apache/datafusion-python/pull/1046#issuecomment-2726951093 Superseded by #1070.

[PR] build(deps): bump async-trait from 0.1.86 to 0.1.88 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] opened a new pull request, #1070: URL: https://github.com/apache/datafusion-python/pull/1070 Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.86 to 0.1.88. Release notes Sourced from https://github.com/dtolnay/async-trait/releases async-trait's

Re: [PR] build(deps): bump async-trait from 0.1.86 to 0.1.87 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] closed pull request #1046: build(deps): bump async-trait from 0.1.86 to 0.1.87 URL: https://github.com/apache/datafusion-python/pull/1046

[PR] build(deps): bump pyo3-build-config from 0.23.4 to 0.24.0 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] opened a new pull request, #1067: URL: https://github.com/apache/datafusion-python/pull/1067 Bumps [pyo3-build-config](https://github.com/pyo3/pyo3) from 0.23.4 to 0.24.0. Release notes Sourced from https://github.com/pyo3/pyo3/releases pyo3-build-config's releas

[PR] build(deps): bump tokio from 1.43.0 to 1.44.1 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] opened a new pull request, #1069: URL: https://github.com/apache/datafusion-python/pull/1069 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.43.0 to 1.44.1. Release notes Sourced from [tokio's releases](https://github.com/tokio-rs/tokio/releases). Tokio

[PR] build(deps): bump uuid from 1.13.1 to 1.16.0 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] opened a new pull request, #1068: URL: https://github.com/apache/datafusion-python/pull/1068 Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.13.1 to 1.16.0. Release notes Sourced from [uuid's releases](https://github.com/uuid-rs/uuid/releases). v1.16.0

Re: [PR] build(deps): bump uuid from 1.13.1 to 1.15.1 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] commented on PR #1039: URL: https://github.com/apache/datafusion-python/pull/1039#issuecomment-2726950949 Superseded by #1068.

[PR] build(deps): bump object_store from 0.11.2 to 0.12.0 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] opened a new pull request, #1071: URL: https://github.com/apache/datafusion-python/pull/1071 Bumps [object_store](https://github.com/apache/arrow-rs) from 0.11.2 to 0.12.0. Changelog Sourced from https://github.com/apache/arrow-rs/blob/main/CHANGELOG-old.md object

Re: [PR] build(deps): bump tokio from 1.43.0 to 1.44.0 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] closed pull request #1047: build(deps): bump tokio from 1.43.0 to 1.44.0 URL: https://github.com/apache/datafusion-python/pull/1047

Re: [PR] build(deps): bump tokio from 1.43.0 to 1.44.0 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] commented on PR #1047: URL: https://github.com/apache/datafusion-python/pull/1047#issuecomment-2726951020 Superseded by #1069.

Re: [PR] build(deps): bump uuid from 1.13.1 to 1.15.1 [datafusion-python]

2025-03-15 Thread via GitHub
dependabot[bot] closed pull request #1039: build(deps): bump uuid from 1.13.1 to 1.15.1 URL: https://github.com/apache/datafusion-python/pull/1039

Re: [I] Migrate datasource tests to `insta` [datafusion]

2025-03-15 Thread via GitHub
shruti2522 commented on issue #15246: URL: https://github.com/apache/datafusion/issues/15246#issuecomment-2726938676 take

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub
alamb commented on code in PR #15201: URL: https://github.com/apache/datafusion/pull/15201#discussion_r1995496219 ## datafusion/sqllogictest/test_files/ddl.slt: ## @@ -855,3 +855,29 @@ DROP TABLE t1; statement ok DROP TABLE t2; + +statement count 0 +create table t(a int) as

Re: [I] `FROM` first in `SELECT` statements [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub
iffyio closed issue #1400: `FROM` first in `SELECT` statements URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1400

Re: [PR] chore: [FOLLOWUP] Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-15 Thread via GitHub
kazuyukitanimura commented on code in PR #1534: URL: https://github.com/apache/datafusion-comet/pull/1534#discussion_r1997219275 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -438,7 +438,7 @@ class CometSparkSessionExtensions op

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on issue #1510: URL: https://github.com/apache/datafusion-comet/issues/1510#issuecomment-2726915643 The first image is live. There are a number of security warnings that we need to look into. https://hub.docker.com/r/apache/datafusion-comet/tags

Re: [PR] chore: Minor code cleanup in native scan type checking [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on code in PR #1537: URL: https://github.com/apache/datafusion-comet/pull/1537#discussion_r1997192540 ## spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala: ## @@ -82,11 +82,7 @@ abstract class ParquetReadSuite extends CometTestBase { }

Re: [I] `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set [datafusion-comet]

2025-03-15 Thread via GitHub
viirya commented on issue #1536: URL: https://github.com/apache/datafusion-comet/issues/1536#issuecomment-2726904269 Hmm, I think it is a mistake. In the commit 1cca8d6f7bd2dfb8e1996bdd55ebe09d08eb8221, I transformed `CometScanExec` added for fully native scan in `CometScanRule` to `CometN

[PR] Improve eliminate_outer_join rule [datafusion]

2025-03-15 Thread via GitHub
suibianwanwank opened a new pull request, #15254: URL: https://github.com/apache/datafusion/pull/15254 ## Which issue does this PR close? - Closes #13232 ## Rationale for this change Inspired by the approach in #13249. This PR explores an alternative that does not r

Re: [PR] Add additional ruff suggestions [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on code in PR #1062: URL: https://github.com/apache/datafusion-python/pull/1062#discussion_r1996771284 ## examples/python-udf-comparisons.py: ## @@ -163,9 +163,9 @@ def udf_using_pyarrow_compute_impl( resultant_arr = pc.and_(resultant_arr, filtered_

Re: [I] Expose to `GroupsAccumulator` whether all the groups are sorted [datafusion]

2025-03-15 Thread via GitHub
vlad-arista commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2724763313 I think the issue here is not `LazyMemoryExec` but `UnnestExec` which doesn't propagate output ordering of the columns not involved in unnesting (judging by the implementati

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15201: URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996487546 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1563,8 +1563,12 @@ async fn with_column_join_same_columns() -> Result<()> { \n Limit: skip=0, fetch=

Re: [PR] chore: Upgrade `rand` crate and some other minor crates [datafusion]

2025-03-15 Thread via GitHub
comphead commented on code in PR #14967: URL: https://github.com/apache/datafusion/pull/14967#discussion_r1996412978 ## datafusion/core/tests/parquet/filter_pushdown.rs: ## @@ -65,7 +65,12 @@ fn generate_file(tempdir: &TempDir, props: WriterProperties) -> TestParquetFile t

Re: [PR] Add all missing table options to be handled in any order [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub
tomershaniii commented on code in PR #1747: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1747#discussion_r1991268759 ## src/ast/dml.rs: ## @@ -138,6 +143,30 @@ pub struct CreateTable { pub engine: Option, pub comment: Option, pub auto_increment_off

Re: [I] [DISCUSS] Release DataFusion `46.0.1` Patch or `46.1.0` minor release (March 2025) [datafusion]

2025-03-15 Thread via GitHub
xudong963 commented on issue #15151: URL: https://github.com/apache/datafusion/issues/15151#issuecomment-2717222611 After all related issues are solved, should we go through the release process again, such as voting on the dev list?

Re: [PR] Renaming Internal Structs [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on PR #1059: URL: https://github.com/apache/datafusion-python/pull/1059#issuecomment-2724609734 Thank you for all the work on this.

Re: [PR] chore: revert "Upgrade to Spark 3.5.4 (#1471)" [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on PR #1493: URL: https://github.com/apache/datafusion-comet/pull/1493#issuecomment-2711566564 > @andygrove I was going to put a rpad fix in https://github.com/apache/datafusion-comet/pull/1482/files but if we decide to revert, I can separate the PR. Let me know.

[PR] Improve feature flag CI coverage `datafusion` and `datafusion-functions` [datafusion]

2025-03-15 Thread via GitHub
alamb opened a new pull request, #15203: URL: https://github.com/apache/datafusion/pull/15203 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/15155 - Follow on to https://github.com/apache/datafusion/pull/15156 ## Rationale for

Re: [I] Support for generating JSON formatted substrait plan [datafusion-python]

2025-03-15 Thread via GitHub
swayam0322 commented on issue #508: URL: https://github.com/apache/datafusion-python/issues/508#issuecomment-2724995989 take

Re: [PR] Use insta for `DataFrame` tests [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15165: URL: https://github.com/apache/datafusion/pull/15165#issuecomment-2725669379 🚀

Re: [I] Dropping Spark 3.3 support [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove closed issue #646: Dropping Spark 3.3 support URL: https://github.com/apache/datafusion-comet/issues/646

Re: [PR] Per file filter evaluation [datafusion]

2025-03-15 Thread via GitHub
adriangb commented on code in PR #15057: URL: https://github.com/apache/datafusion/pull/15057#discussion_r1988141252 ## datafusion/datasource-parquet/src/source.rs: ## @@ -559,24 +556,8 @@ impl FileSource for ParquetSource { .predicate()

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #14123: URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2703664954 > Should [ScalarUDFImpl::invoke_batch](https://github.com/apache/datafusion/blob/43ecd9b807877946706628633308f73a4645de1f/datafusion/expr/src/udf.rs#L616) be marked as deprecated?

Re: [I] Implement tree explain for `CoalescePartitionsExec` [datafusion]

2025-03-15 Thread via GitHub
Standing-Man commented on issue #15195: URL: https://github.com/apache/datafusion/issues/15195#issuecomment-2723024655 > [@Standing-Man](https://github.com/Standing-Man) is this issue or what? this doesn't have any description for the above-mentioned issue You can check the details i

Re: [I] Support Push down expression evaluation in `TableProviders` [datafusion]

2025-03-15 Thread via GitHub
adriangb commented on issue #14993: URL: https://github.com/apache/datafusion/issues/14993#issuecomment-2704601029 > But roughly, DataFusion asks the table provider which expressions it can push-down, and the node is configured with both a projection expression and a filter expression. Exac

Re: [I] Publish official Docker images to Docker Hub under Apache account [datafusion-comet]

2025-03-15 Thread via GitHub
comphead commented on issue #1510: URL: https://github.com/apache/datafusion-comet/issues/1510#issuecomment-2726807145 thanks @andygrove I think it makes sense to me, Apache Spark also includes the platform and Comet has the platform variety as well. Although I'm not sure if platforms othe

Re: [PR] chore: Minor code cleanup in native scan type checking [datafusion-comet]

2025-03-15 Thread via GitHub
codecov-commenter commented on PR #1537: URL: https://github.com/apache/datafusion-comet/pull/1537#issuecomment-2726800741 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1537?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Support `EXPLAIN ... FORMAT ...` [datafusion]

2025-03-15 Thread via GitHub
alamb merged PR #15166: URL: https://github.com/apache/datafusion/pull/15166

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub
alamb commented on code in PR #15201: URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996808102 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1571,14 +1571,18 @@ async fn with_column_join_same_columns() -> Result<()> { assert_snapshot!( df_w

Re: [I] Expose to `GroupsAccumulator` whether all the groups are sorted [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #14991: URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2726553041 Here is the example / test case from @asubiotto in https://github.com/apache/datafusion/issues/14991#issuecomment-2720197770 In the following plan, both `Aggregate` sh

Re: [PR] Enable `used_underscore_binding` clippy lint [datafusion]

2025-03-15 Thread via GitHub
alamb merged PR #15189: URL: https://github.com/apache/datafusion/pull/15189

Re: [PR] Update version to 46.0.1, add CHANGELOG (#15243) [datafusion]

2025-03-15 Thread via GitHub
xudong963 commented on PR #15244: URL: https://github.com/apache/datafusion/pull/15244#issuecomment-2726776760 > 🤔 something seems to be wrong here yeah, missed something, fixed

[PR] Simplify display format of `AggregateFunctionExpr` [datafusion]

2025-03-15 Thread via GitHub
irenjj opened a new pull request, #15253: URL: https://github.com/apache/datafusion/pull/15253 ## Which issue does this PR close? - Closes #15252 ## Rationale for this change ## What changes are included in this PR? ## Are these changes test

Re: [PR] feat/improve ruff test coverage [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on PR #1055: URL: https://github.com/apache/datafusion-python/pull/1055#issuecomment-2710849717 #1056 contains the follow-on work to apply additional rules

Re: [PR] feat: `INSERT INTO` support [datafusion-ballista]

2025-03-15 Thread via GitHub
milenkovicm closed pull request #1177: feat: `INSERT INTO` support URL: https://github.com/apache/datafusion-ballista/pull/1177

[PR] Minor: More comment to aggregation fuzzer [datafusion]

2025-03-15 Thread via GitHub
2010YOUY01 opened a new pull request, #15048: URL: https://github.com/apache/datafusion/pull/15048 ## Which issue does this PR close? - Closes #. ## Rationale for this change I'm thinking about enhancing the sort fuzzer, so I checked our nice aggregate fuzzer

Re: [PR] chore: revert "Upgrade to Spark 3.5.4 (#1471)" [datafusion-comet]

2025-03-15 Thread via GitHub
kazuyukitanimura commented on PR #1493: URL: https://github.com/apache/datafusion-comet/pull/1493#issuecomment-2711531218 @andygrove I was going to put a rpad fix in https://github.com/apache/datafusion-comet/pull/1482/files but if we decide to revert, I can separate the PR. Let me know.

Re: [I] Introduce ProjectionMask To Allow Nested Projection Pushdown [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2704070949 Related ticket: - https://github.com/apache/datafusion/issues/14993

Re: [PR] feat: add `register_metadata` function for `GroupsAccumulator` [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15022: URL: https://github.com/apache/datafusion/pull/15022#discussion_r1988229738 ## datafusion/expr-common/src/groups_accumulator.rs: ## @@ -251,3 +261,18 @@ pub trait GroupsAccumulator: Send { /// compute, not `O(num_groups)` fn s

Re: [PR] Improve explain tree formatting for longer lines / word wrap [datafusion]

2025-03-15 Thread via GitHub
irenjj commented on code in PR #15031: URL: https://github.com/apache/datafusion/pull/15031#discussion_r1983299366 ## datafusion/physical-plan/Cargo.toml: ## @@ -63,6 +63,8 @@ log = { workspace = true } parking_lot = { workspace = true } pin-project-lite = "^0.2.7" tokio = {

Re: [PR] Order Requirement Analysis [datafusion-site]

2025-03-15 Thread via GitHub
Omega359 commented on code in PR #58: URL: https://github.com/apache/datafusion-site/pull/58#discussion_r1987631033 ## content/blog/2025-03-05-ordering-analysis.md: ## @@ -0,0 +1,353 @@ +--- +layout: post +title: Analysis of Ordering for Better Plans +date: 2025-03-05 +author: M

Re: [I] Weekly Plan (Andrew Lamb) March 3, 2025 [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #14978: URL: https://github.com/apache/datafusion/issues/14978#issuecomment-2710258876 - Next week https://github.com/apache/datafusion/issues/15121

Re: [I] [EPIC] A collection of tickets for improved WASM support in DataFusion [datafusion]

2025-03-15 Thread via GitHub
savaliyabhargav commented on issue #13815: URL: https://github.com/apache/datafusion/issues/13815#issuecomment-2724623356 Dear @alamb , I hope you’re doing well. I’m very interested in contributing to the WASM support for DataFusion project as part of GSoC 2025. Enhancing embe

Re: [PR] feat: topk functionality for aggregates should support utf8view and largeutf8 [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15152: URL: https://github.com/apache/datafusion/pull/15152#issuecomment-2721226874 FYI @avantgardnerio

Re: [PR] Simplify display format of `AggregateFunctionExpr` [datafusion]

2025-03-15 Thread via GitHub
irenjj commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r1997046455 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2600,43 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for SqlDi

[I] Change in behavior for deep structure columns with the latest sql parser upgrade [datafusion]

2025-03-15 Thread via GitHub
adragomir opened a new issue, #15118: URL: https://github.com/apache/datafusion/issues/15118 ### Describe the bug with SQLParser 0.53, this works: ``` SELECT * FROM (meta_asset_featurization AS asset_meta INNER JOIN meta_asset_summary_metrics AS asset_metrics O

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-15 Thread via GitHub
onlyjackfrost commented on PR #15209: URL: https://github.com/apache/datafusion/pull/15209#issuecomment-2721725989 @eliaperantoni could you help review? :)

Re: [PR] docs: various improvements to tuning guide [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on code in PR #1525: URL: https://github.com/apache/datafusion-comet/pull/1525#discussion_r1996138171 ## docs/source/user-guide/tuning.md: ## @@ -23,12 +23,84 @@ Comet provides some tuning options to help you get the best performance from you ## Memory Tu

[PR] fix: re-enable GitHub discussions [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove opened a new pull request, #1532: URL: https://github.com/apache/datafusion-comet/pull/1532 ## Which issue does this PR close? N/A ## Rationale for this change Our GitHub discussions disappeared! Possibly related to an update in ASF tooling ment

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub
parthchandra commented on PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#issuecomment-2725354473 @mbutrovich, @comphead I've rebased this on main (after factoring out the object store related changes) and force pushed. If you could please take a look again. @andygro

[I] Unparse of logical plans with `LEFT ANTI` and `LEFT SEMI` joins generate invalid SQL [datafusion]

2025-03-15 Thread via GitHub
nuno-faria opened a new issue, #15127: URL: https://github.com/apache/datafusion/issues/15127 ### Describe the bug When unparsing a logical plan containing a `LeftAnti Join` or a `LeftSemi Join` operator with the `unparser::dialect::PostgreSqlDialect`, the resulting unparsed SQL cont

[PR] Implement tree explain for CoalescePartitionsExec [datafusion]

2025-03-15 Thread via GitHub
Shreyaskr1409 opened a new pull request, #15225: URL: https://github.com/apache/datafusion/pull/15225 ## Which issue does this PR close? - Closes #15195 . ## Rationale for this change ## What changes are included in this PR? Changes explain_tree.slt and coalesce_pa

Re: [PR] docs: various improvements to tuning guide [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on code in PR #1525: URL: https://github.com/apache/datafusion-comet/pull/1525#discussion_r1996139134 ## docs/source/user-guide/tuning.md: ## @@ -17,18 +17,96 @@ specific language governing permissions and limitations under the License. --> -# Tuning Guid

[PR] fix: use common implementation of handling object store and hdfs urls for native_datafusion and native_iceberg_compat [datafusion-comet]

2025-03-15 Thread via GitHub
parthchandra opened a new pull request, #1494: URL: https://github.com/apache/datafusion-comet/pull/1494 Addresses part of comments in https://github.com/apache/datafusion-comet/pull/1443 Replaces the `register_object_store` method that provided hdfs support with a unified `prepare_o

[I] Add all functions to the Expr class so that they're chainable. [datafusion-python]

2025-03-15 Thread via GitHub
deanm opened a new issue, #1064: URL: https://github.com/apache/datafusion-python/issues/1064 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Instead of doing ```python df.select( F.abs( col("id1")

Re: [PR] Snowflake: Support dollar quoted comment when creating tables, views, and their fields [datafusion-sqlparser-rs]

2025-03-15 Thread via GitHub
iffyio commented on code in PR #1755: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1755#discussion_r1995819665 ## src/parser/mod.rs: ## @@ -6926,14 +6927,16 @@ impl<'a> Parser<'a> { let comment = if self.parse_keyword(Keyword::COMMENT) { let

[PR] datafusion-cli: add streaming state struct [datafusion]

2025-03-15 Thread via GitHub
shruti2522 opened a new pull request, #15234: URL: https://github.com/apache/datafusion/pull/15234 ## Which issue does this PR close? - Closes #14886 . ## Rationale for this change ## What changes are included in this PR? add `OutputStreamStruct` to av

Re: [PR] chore: Use Datafusion's existing empty stream [datafusion-comet]

2025-03-15 Thread via GitHub
codecov-commenter commented on PR #1517: URL: https://github.com/apache/datafusion-comet/pull/1517#issuecomment-2719120442 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1517?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Document guidelines for physical operator yielding [datafusion]

2025-03-15 Thread via GitHub
ozankabak commented on code in PR #15030: URL: https://github.com/apache/datafusion/pull/15030#discussion_r1984828017 ## datafusion/physical-plan/src/execution_plan.rs: ## @@ -260,13 +260,30 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync { /// used. /// Th

Re: [PR] Implement tree explain for `LocalLimitExec` [datafusion]

2025-03-15 Thread via GitHub
alamb commented on PR #15232: URL: https://github.com/apache/datafusion/pull/15232#issuecomment-2725553415 Thanks again @shruti2522 and @comphead

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-15 Thread via GitHub
onlyjackfrost commented on PR #15209: URL: https://github.com/apache/datafusion/pull/15209#issuecomment-2725191499 @eliaperantoni could I raise another PR for the other unary expressions and keep this PR for the PLUS unary expression?

Re: [PR] chore: Minor code cleanup in native scan type-checking [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on code in PR #1537: URL: https://github.com/apache/datafusion-comet/pull/1537#discussion_r1997045440 ## spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala: ## @@ -199,15 +199,15 @@ class CometSparkSessionExtensions _,

[PR] chore: Minor code cleanup in native scan type-checking [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove opened a new pull request, #1537: URL: https://github.com/apache/datafusion-comet/pull/1537 ## Which issue does this PR close? N/A ## Rationale for this change Unify code for `native_datafusion` and `native_iceberg_compat` when checking for supp

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove commented on code in PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1996054834 ## native/core/src/parquet/parquet_exec.rs: ## @@ -0,0 +1,141 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [I] Expose global context [datafusion-python]

2025-03-15 Thread via GitHub
timsaucer commented on issue #1045: URL: https://github.com/apache/datafusion-python/issues/1045#issuecomment-2710856896 I understand what you mean by "global context is already exposed" but I mean that it should be treated in the wrapper functions as well so that users get a nice python i

Re: [I] Upgrade Guide for DataFusion 46 does not include the array signatures change [datafusion]

2025-03-15 Thread via GitHub
jkosh44 commented on issue #15105: URL: https://github.com/apache/datafusion/issues/15105#issuecomment-2725983472 > [@jkosh44](https://github.com/jkosh44) Would you be able to add a note about array signature changes to https://github.com/apache/datafusion/blob/main/docs/source/library-user

[I] [substrait] Build basic test suite to validate produced Substrait plans [datafusion]

2025-03-15 Thread via GitHub
amoeba opened a new issue, #15069: URL: https://github.com/apache/datafusion/issues/15069 ### Is your feature request related to a problem or challenge? In https://github.com/apache/datafusion/issues/12244 we found that DataFusion was producing invalid Substrait plans. The `datafusion

Re: [PR] WIP: User defined sorting [datafusion]

2025-03-15 Thread via GitHub
tobixdev commented on PR #15106: URL: https://github.com/apache/datafusion/pull/15106#issuecomment-2725048476 Happy to hear any kind of feedback on that @paleolimbot . So take a look if you've time. Also good to hear that others also have these requirements in their projects :). Could you

Re: [PR] chore: Remove all subdependencies [datafusion-comet]

2025-03-15 Thread via GitHub
andygrove merged PR #1514: URL: https://github.com/apache/datafusion-comet/pull/1514

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-15 Thread via GitHub
mbutrovich commented on code in PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1996167458 ## native/core/src/parquet/mod.rs: ## @@ -620,12 +619,21 @@ fn get_batch_context<'a>(handle: jlong) -> Result<&'a mut BatchContext, CometErr } } -/

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-15 Thread via GitHub
jayzhan211 commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996471597 ## datafusion/expr-common/src/signature.rs: ## @@ -865,6 +867,39 @@ impl Signature { volatility, } } + +/// Specialized Signature

Re: [I] [DISCUSS] Release DataFusion `46.0.1` Patch or `46.1.0` minor release (March 2025) [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #15151: URL: https://github.com/apache/datafusion/issues/15151#issuecomment-2724618093 - Unfortunately https://github.com/apache/datafusion/issues/15114 doesn't seem likely to make it into this release. I'll proceed with making backport PRs now

Re: [PR] [branch-46] Fix wasm32 build on version 46 [datafusion]

2025-03-15 Thread via GitHub
alamb closed pull request #15229: [branch-46] Fix wasm32 build on version 46 URL: https://github.com/apache/datafusion/pull/15229
