Re: [PR] Fix MySQL parsing of GRANT, REVOKE, and CREATE VIEW [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
iffyio commented on code in PR #1538: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1538#discussion_r1896650574 ## src/parser/mod.rs: ## @@ -11375,7 +11466,11 @@ impl<'a> Parser<'a> { } else { let object_type = self.parse_one

Re: [PR] [substrait] Add support for ExtensionTable [datafusion]

2024-12-24 Thread via GitHub
ccciudatu commented on PR #13772: URL: https://github.com/apache/datafusion/pull/13772#issuecomment-2560986285 @vbarua I rebased on top of https://github.com/apache/datafusion/pull/13803 and added a `consume_extension_table` method to the `SubstraitConsumer` trait. I chose to pass the sch

Re: [PR] Support Snowflake Update-From-Select [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
iffyio commented on code in PR #1604: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1604#discussion_r1896652348 ## src/ast/spans.rs: ## @@ -2106,6 +2106,15 @@ impl Spanned for SelectInto { } } +impl Spanned for UpdateTableFromKind { +fn span(&self) ->

[I] org.apache.spark.sql.catalyst.expressions.BoundReference cannot be cast to class org.apache.spark.sql.ColumnarExpression [datafusion-comet]

2024-12-24 Thread via GitHub
kazuyukitanimura opened a new issue, #1197: URL: https://github.com/apache/datafusion-comet/issues/1197 ### Describe the bug In SparkSessionExtensionSuite, there are 4 test failures with `org.apache.spark.sql.catalyst.expressions.BoundReference cannot be cast to class org.apache.spa

Re: [PR] Parse Postgres's LOCK TABLE statement [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
iffyio commented on code in PR #1614: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1614#discussion_r1896659962 ## src/parser/mod.rs: ## @@ -9604,7 +9604,13 @@ impl<'a> Parser<'a> { top = Some(self.parse_top()?); } -let projection =

Re: [PR] Introduce `UserDefinedLogicalNodeUnparser` for User-defined Logical Plan unparsing [datafusion]

2024-12-24 Thread via GitHub
goldmedal commented on code in PR #13880: URL: https://github.com/apache/datafusion/pull/13880#discussion_r1896665295 ## datafusion/sql/src/unparser/extension_unparser.rs: ## @@ -0,0 +1,66 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[PR] Minor: change visibility of hash join utils [datafusion]

2024-12-24 Thread via GitHub
berkaysynnada opened a new pull request, #13893: URL: https://github.com/apache/datafusion/pull/13893 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tes

Re: [PR] Improve parsing performance by reducing token cloning [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb commented on PR #1587: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1587#issuecomment-2561234597 I merged up to resolve a conflict. While reviewing the code I have some ideas on how to simplify it -- I'll try and make a PR later today. It will be a fun coding exercise

Re: [I] Introduce a way to represent constrained statistics / bounds on values in Statistics [datafusion]

2024-12-24 Thread via GitHub
ozankabak commented on issue #8078: URL: https://github.com/apache/datafusion/issues/8078#issuecomment-2561235472 > I don't think the arrow proposal handles the subtlety about intervals, known facts, etc but we should at least be aware of them (thanks @edmondop for pointing this out to me)

Re: [PR] Fix `recursive-protection` feature flag [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13887: URL: https://github.com/apache/datafusion/pull/13887#issuecomment-2561235409 > LGTM. > > One concern is that some users may depend on the feature, so maybe we should highlight the change at the next release log. This is a good point -- I believe 44

Re: [I] Introduce a way to represent constrained statistics / bounds on values in Statistics [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #8078: URL: https://github.com/apache/datafusion/issues/8078#issuecomment-2561149780 FWIW there is a current move to add statistics into the arrow format itself: - https://github.com/apache/arrow/pull/45058 I actually think we could standardize on converting

Re: [I] Find a way to communicate the ordering of a file back with the existing listing table implementation [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13891: URL: https://github.com/apache/datafusion/issues/13891#issuecomment-2561155800 One way to do this would be to write DataFusion specific metadata into the files (e.g. add something to https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterPropert

Re: [PR] Find keywords using perfect hashing [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb commented on PR #1590: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1590#issuecomment-2561210721 Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look -- This is an automated messag

Re: [PR] Support Snowflake Update-From-Select [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb merged PR #1604: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1604

Re: [PR] Slightly faster keyword lookups [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb commented on PR #1591: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1591#issuecomment-2561210619 Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look

Re: [PR] Improve parsing performance by reducing token cloning [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb commented on PR #1587: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1587#issuecomment-2561212194 I am going to merge this PR up to resolve the conflicts and make some of the suggested improvements in documentation, etc as a follow on PR

Re: [I] Different behavior in datafusion 35.0.0 in reading hive-partitioned parquet data [datafusion-python]

2024-12-24 Thread via GitHub
kylebarron commented on issue #579: URL: https://github.com/apache/datafusion-python/issues/579#issuecomment-2561217360 This is probably a duplicate of https://github.com/apache/datafusion-python/issues/957

[PR] Add support for MYSQL's `RENAME TABLE` expr [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
wugeer opened a new pull request, #1616: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1616 This PR supports the `RENAME TABLE` clause for the MySQL dialect. For more information, please refer to: https://dev.mysql.com/doc/refman/9.1/en/rename-table.html This resolves issue

Re: [I] Find a way to communicate the ordering of a file back with the existing listing table implementation [datafusion]

2024-12-24 Thread via GitHub
zhuqi-lucas commented on issue #13891: URL: https://github.com/apache/datafusion/issues/13891#issuecomment-2560936037 take

Re: [I] Support (order by / sort) for DataFrameWriteOptions [datafusion]

2024-12-24 Thread via GitHub
Dandandan closed issue #13873: Support (order by / sort) for DataFrameWriteOptions URL: https://github.com/apache/datafusion/issues/13873

Re: [PR] Support (order by / sort) for DataFrameWriteOptions [datafusion]

2024-12-24 Thread via GitHub
Dandandan merged PR #13874: URL: https://github.com/apache/datafusion/pull/13874

Re: [PR] Support (order by / sort) for DataFrameWriteOptions [datafusion]

2024-12-24 Thread via GitHub
Dandandan commented on PR #13874: URL: https://github.com/apache/datafusion/pull/13874#issuecomment-2560941432 Thank you @zhuqi-lucas

[PR] chore(deps): update parquet requirement from 53.3.0 to 54.0.0 [datafusion]

2024-12-24 Thread via GitHub
dependabot[bot] opened a new pull request, #13892: URL: https://github.com/apache/datafusion/pull/13892 Updates the requirements on [parquet](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/main/CHANGE

Re: [PR] Introduce LogicalPlan invariants, begin automatically checking them [datafusion]

2024-12-24 Thread via GitHub
berkaysynnada commented on code in PR #13651: URL: https://github.com/apache/datafusion/pull/13651#discussion_r1896751558 ## datafusion/expr/src/logical_plan/invariants.rs: ## @@ -15,14 +15,98 @@ // specific language governing permissions and limitations // under the License.

Re: [PR] Introduce LogicalPlan invariants, begin automatically checking them [datafusion]

2024-12-24 Thread via GitHub
berkaysynnada commented on code in PR #13651: URL: https://github.com/apache/datafusion/pull/13651#discussion_r1896754835 ## datafusion/optimizer/src/optimizer.rs: ## @@ -384,9 +396,26 @@ impl Optimizer { // rule handles recursion itself

[PR] Minor: change the sort merge join emission as incremental [datafusion]

2024-12-24 Thread via GitHub
berkaysynnada opened a new pull request, #13894: URL: https://github.com/apache/datafusion/pull/13894 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tes

[PR] Minor: Avoid emitting empty batches in partial sort [datafusion]

2024-12-24 Thread via GitHub
berkaysynnada opened a new pull request, #13895: URL: https://github.com/apache/datafusion/pull/13895 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tes

Re: [PR] Prepare for 44.0.0 release: version and changelog [datafusion]

2024-12-24 Thread via GitHub
goldmedal commented on code in PR #13882: URL: https://github.com/apache/datafusion/pull/13882#discussion_r1896668065 ## dev/release/generate-changelog.py: ## @@ -44,6 +44,7 @@ def generate_changelog(repo, repo_name, tag1, tag2, version): unique_pulls = [] all_pulls =

[I] Add support of rand() expression [datafusion-comet]

2024-12-24 Thread via GitHub
akupchinskiy opened a new issue, #1198: URL: https://github.com/apache/datafusion-comet/issues/1198 ### What is the problem the feature request solves? Native-comet implementation of spark rand() function. ### Describe the potential solution _No response_ ### Addi

[PR] feat: rand expression support [datafusion-comet]

2024-12-24 Thread via GitHub
akupchinskiy opened a new pull request, #1199: URL: https://github.com/apache/datafusion-comet/pull/1199 ## Which issue does this PR close? Closes [#1198](https://github.com/apache/datafusion-comet/issues/1198) ## Rationale for this change Support of the spark

Re: [PR] Introduce LogicalPlan invariants, begin automatically checking them [datafusion]

2024-12-24 Thread via GitHub
alamb commented on code in PR #13651: URL: https://github.com/apache/datafusion/pull/13651#discussion_r1896724308 ## datafusion/expr/src/logical_plan/invariants.rs: ## @@ -15,14 +15,98 @@ // specific language governing permissions and limitations // under the License. -use c

Re: [PR] Prepare for 44.0.0 release: version and changelog [datafusion]

2024-12-24 Thread via GitHub
alamb commented on code in PR #13882: URL: https://github.com/apache/datafusion/pull/13882#discussion_r1896731319 ## dev/release/generate-changelog.py: ## @@ -44,6 +44,7 @@ def generate_changelog(repo, repo_name, tag1, tag2, version): unique_pulls = [] all_pulls = []

Re: [PR] Minor: Avoid emitting empty batches in partial sort [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13895: URL: https://github.com/apache/datafusion/pull/13895#issuecomment-2561102880 FYI @comphead and @korowa

Re: [PR] Minor: change the sort merge join emission as incremental [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13894: URL: https://github.com/apache/datafusion/pull/13894#issuecomment-2561103467 I wonder if there is some way / reason to test this (I realize it is just a property, but I was worried it might be broken in the future)

Re: [PR] Fix `recursive-protection` feature flag [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13887: URL: https://github.com/apache/datafusion/pull/13887#issuecomment-2561107144 FYI @buraksenn and @peter-toth

Re: [I] Release DataFusion `44.0.0` [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13334: URL: https://github.com/apache/datafusion/issues/13334#issuecomment-2561104600 I think all we are waiting on now is someone to review / approve (or fix in some other way) the `recursive-protection` feature: - https://github.com/apache/datafusion/pull/13887

Re: [PR] Prepare for 44.0.0 release: version and changelog [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13882: URL: https://github.com/apache/datafusion/pull/13882#issuecomment-2561105998 Thank you @phillipleblanc and @goldmedal I am waiting on the final ticket required for the 44 release: - https://github.com/apache/datafusion/issues/13334#issuecomment-256110

Re: [PR] chore(deps): update parquet requirement from 53.3.0 to 54.0.0 [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13892: URL: https://github.com/apache/datafusion/pull/13892#issuecomment-2561106598 - See https://github.com/apache/datafusion/pull/13663 where I have done the upgrade work

Re: [PR] Minor: change the sort merge join emission as incremental [datafusion]

2024-12-24 Thread via GitHub
alamb merged PR #13894: URL: https://github.com/apache/datafusion/pull/13894

Re: [PR] Test CI tests without `Swatinem/rust-cache@v2` [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13889: URL: https://github.com/apache/datafusion/pull/13889#issuecomment-2561109842 Test complete -- answer is the caching doesn't seem to help much: - https://github.com/apache/datafusion/pull/13876#discussion_r1896192661

Re: [PR] Test CI tests without `Swatinem/rust-cache@v2` [datafusion]

2024-12-24 Thread via GitHub
alamb closed pull request #13889: Test CI tests without `Swatinem/rust-cache@v2` URL: https://github.com/apache/datafusion/pull/13889

[I] Downloading IMDB dataset for benchmarks gives 404 Not Found [datafusion]

2024-12-24 Thread via GitHub
alihan-synnada opened a new issue, #13896: URL: https://github.com/apache/datafusion/issues/13896 ### Describe the bug Attempting to download the IMDB dataset gives the following error: ``` tar: Error opening archive: Unrecognized archive format ``` An `IMDB.tgz` is

Re: [PR] ci improvements [datafusion]

2024-12-24 Thread via GitHub
alamb commented on code in PR #13876: URL: https://github.com/apache/datafusion/pull/13876#discussion_r1896735651 ## .github/actions/setup-builder/action.yaml: ## @@ -42,6 +42,8 @@ runs: "${RETRY[@]}" rustup component add rustfmt - name: Configure rust runtime env

Re: [PR] Support unparsing implicit lateral `UNNEST` plan to SQL text [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13824: URL: https://github.com/apache/datafusion/pull/13824#issuecomment-2561182687 I plan to merge tomorrow unless someone beats me to it

Re: [PR] Introduce `UserDefinedLogicalNodeUnparser` for User-defined Logical Plan unparsing [datafusion]

2024-12-24 Thread via GitHub
alamb commented on code in PR #13880: URL: https://github.com/apache/datafusion/pull/13880#discussion_r1896789436 ## datafusion-examples/examples/plan_to_sql.rs: ## @@ -152,3 +172,144 @@ async fn round_trip_plan_to_sql_demo() -> Result<()> { Ok(()) } + +#[derive(Debug, P

Re: [PR] Introduce LogicalPlan invariants, begin automatically checking them [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13651: URL: https://github.com/apache/datafusion/pull/13651#issuecomment-2561158792 My performance benchmarks also show no difference ✅ ``` ++ critcmp main 13525_invariant-checking-for-implicit-LP-changes group

[PR] Implement maintains_input_order for AggregateExec [datafusion]

2024-12-24 Thread via GitHub
alihan-synnada opened a new pull request, #13897: URL: https://github.com/apache/datafusion/pull/13897 ## Which issue does this PR close? None ## Rationale for this change `maintains_input_order` helps with sort pushdown optimization. As explained in [`InputOrderMode` d

Re: [PR] ci improvements [datafusion]

2024-12-24 Thread via GitHub
alamb commented on code in PR #13876: URL: https://github.com/apache/datafusion/pull/13876#discussion_r1896785019 ## .github/workflows/rust.yml: ## @@ -288,17 +318,20 @@ jobs: mv *.tbl ../datafusion/sqllogictest/test_files/tpch/data - name: Verify that benchmar

Re: [PR] Add substrait tpch round trip tests from sql query [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13888: URL: https://github.com/apache/datafusion/pull/13888#issuecomment-2561185054 > Lastly, is it convention that we would add #[ignore] to the cases that do not pass at the moment, and open those up as we fix bugs? I think that is a reasonable approach -- tha

Re: [PR] Add snapshot testing to CLI & set up AWS mock [datafusion]

2024-12-24 Thread via GitHub
alamb commented on code in PR #13672: URL: https://github.com/apache/datafusion/pull/13672#discussion_r1896793013 ## datafusion-cli/tests/cli_integration.rs: ## @@ -17,42 +17,223 @@ use std::process::Command; -use assert_cmd::prelude::{CommandCargoExt, OutputAssertExt}; -us

Re: [PR] Add substrait tpch round trip tests from sql query [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13888: URL: https://github.com/apache/datafusion/pull/13888#issuecomment-2561185596 (I really like the idea of merging the tests, even if they don't all pass, in one PR and then working on fixes to the tests as additional follow on PRs)

Re: [I] inner join involving hive-partitioned parquet dataset and filters on LHS and RHS causes panic [datafusion]

2024-12-24 Thread via GitHub
jwimberl commented on issue #9797: URL: https://github.com/apache/datafusion/issues/9797#issuecomment-2561191678 Went back -- 43.1.0 is the earlier version where the issue is fixed.

Re: [I] Different behavior in datafusion 35.0.0 in reading hive-partitioned parquet data [datafusion-python]

2024-12-24 Thread via GitHub
jwimberl commented on issue #579: URL: https://github.com/apache/datafusion-python/issues/579#issuecomment-2561193405 This issue reproduces using the newest datafusion release, 43.1.0.

Re: [I] Cannot create a `List` of `FixedSizedList` in SQL [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13819: URL: https://github.com/apache/datafusion/issues/13819#issuecomment-2561203320 > @alamb Sorry for the late response. Based on your discussion, I think making your example work with the behavior aligning with DuckDB (i.e. don't cast back to `FixedSizeList` on

Re: [I] inner join involving hive-partitioned parquet dataset and filters on LHS and RHS causes panic [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #9797: URL: https://github.com/apache/datafusion/issues/9797#issuecomment-2561209180 Thanks @jwimberl for double checking!

Re: [PR] Introduce LogicalPlan invariants, begin automatically checking them [datafusion]

2024-12-24 Thread via GitHub
jonahgao commented on code in PR #13651: URL: https://github.com/apache/datafusion/pull/13651#discussion_r1896844656 ## datafusion/optimizer/src/optimizer.rs: ## @@ -384,9 +396,26 @@ impl Optimizer { // rule handles recursion itself None

Re: [PR] Minor: change visibility of hash join utils [datafusion]

2024-12-24 Thread via GitHub
ozankabak merged PR #13893: URL: https://github.com/apache/datafusion/pull/13893

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-24 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2561318792 Thanks @Omega359 Opt-level is 3 by default for the release https://doc.rust-lang.org/cargo/reference/profiles.html#release which focuses on maximum runtime speed, I think it is

Re: [I] Release DataFusion `44.0.0` [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13334: URL: https://github.com/apache/datafusion/issues/13334#issuecomment-2561496114 Ok, we are clear for 🚀 from what I can tell -- the comet upgrade is working. I will merge the following PR - https://github.com/apache/datafusion/pull/13882 (thanks @xudong963 a

Re: [I] Test DataFusion 44.0.0 with Comet [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13835: URL: https://github.com/apache/datafusion/issues/13835#issuecomment-2561495768 Looks like we are good to go now: https://github.com/apache/datafusion-comet/pull/1154 https://github.com/user-attachments/assets/8854bf3f-f1cb-45f2-9081-132ef51f27ed

Re: [I] Test DataFusion 44.0.0 with Comet [datafusion]

2024-12-24 Thread via GitHub
alamb closed issue #13835: Test DataFusion 44.0.0 with Comet URL: https://github.com/apache/datafusion/issues/13835

Re: [I] Downloading IMDB dataset for benchmarks gives 404 Not Found [datafusion]

2024-12-24 Thread via GitHub
alihan-synnada commented on issue #13896: URL: https://github.com/apache/datafusion/issues/13896#issuecomment-2561634482 @Spaarsh Nice find! It works for me too. I think you can open the PR and hopefully it will be merged without much delay.

Re: [PR] Implement maintains_input_order for AggregateExec [datafusion]

2024-12-24 Thread via GitHub
ozankabak commented on PR #13897: URL: https://github.com/apache/datafusion/pull/13897#issuecomment-2561681712 If we add a unit test with an unnecessary `SortExec` on top of an order-maintaining `AggregateExec`, would the `EnforceSorting` rule remove it with this change? I guess it wouldn't

Re: [PR] Minor: Avoid emitting empty batches in partial sort [datafusion]

2024-12-24 Thread via GitHub
ozankabak merged PR #13895: URL: https://github.com/apache/datafusion/pull/13895

Re: [PR] Minor: Avoid emitting empty batches in partial sort [datafusion]

2024-12-24 Thread via GitHub
ozankabak commented on PR #13895: URL: https://github.com/apache/datafusion/pull/13895#issuecomment-2561676893 This is already reviewed by 3 people and is a fairly easy change, so I will go ahead and merge it

Re: [PR] Fix visibility of `swap_hash_join` to be `pub` [datafusion]

2024-12-24 Thread via GitHub
berkaysynnada commented on PR #13899: URL: https://github.com/apache/datafusion/pull/13899#issuecomment-2561665022 Sorry for the inconvenience, I was probably trying to set all the functions to the same visibility.

Re: [PR] Fix visibility of `swap_hash_join` to be `pub` [datafusion]

2024-12-24 Thread via GitHub
ozankabak commented on PR #13899: URL: https://github.com/apache/datafusion/pull/13899#issuecomment-2561664524 Thanks for the fast fix @alamb and sorry for the inconvenience

Re: [PR] Minor: change visibility of hash join utils [datafusion]

2024-12-24 Thread via GitHub
ozankabak commented on code in PR #13893: URL: https://github.com/apache/datafusion/pull/13893#discussion_r1897151601 ## datafusion/core/src/physical_optimizer/join_selection.rs: ## @@ -176,7 +176,7 @@ fn swap_join_projection( /// This function swaps the inputs of the given joi

Re: [PR] Default to ZSTD compression when writing Parquet [datafusion-python]

2024-12-24 Thread via GitHub
kylebarron commented on code in PR #981: URL: https://github.com/apache/datafusion-python/pull/981#discussion_r1896894738 ## python/datafusion/dataframe.py: ## @@ -620,16 +620,24 @@ def write_csv(self, path: str | pathlib.Path, with_header: bool = False) -> None def write_

Re: [PR] Introduce LogicalPlan invariants, begin automatically checking them [datafusion]

2024-12-24 Thread via GitHub
wiedld commented on code in PR #13651: URL: https://github.com/apache/datafusion/pull/13651#discussion_r1896931677 ## datafusion/optimizer/src/optimizer.rs: ## @@ -384,9 +396,26 @@ impl Optimizer { // rule handles recursion itself None =

Re: [PR] Fix `recursive-protection` feature flag [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13887: URL: https://github.com/apache/datafusion/pull/13887#issuecomment-2561452915 Thank you for the reviews @andygrove and @xudong963

Re: [PR] Fix `recursive-protection` feature flag [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13887: URL: https://github.com/apache/datafusion/pull/13887#issuecomment-2561452851 > LGTM but I have not tested with Comet. Thanks @alamb I have tested so locally -- specifically I checked out the code in https://github.com/apache/datafusion-comet/pull/1154

Re: [PR] Fix `recursive-protection` feature flag [datafusion]

2024-12-24 Thread via GitHub
alamb merged PR #13887: URL: https://github.com/apache/datafusion/pull/13887

Re: [I] Making the `recursive` dependency an optional feature [datafusion]

2024-12-24 Thread via GitHub
alamb closed issue #13766: Making the `recursive` dependency an optional feature URL: https://github.com/apache/datafusion/issues/13766

Re: [PR] Introduce LogicalPlan invariants, begin automatically checking them [datafusion]

2024-12-24 Thread via GitHub
jonahgao commented on code in PR #13651: URL: https://github.com/apache/datafusion/pull/13651#discussion_r1897045848 ## datafusion/sqllogictest/test_files/subquery.slt: ## @@ -433,17 +433,32 @@ logical_plan 08)--TableScan: t1 projection=[t1_int] #invalid_scalar_subqu

[PR] chore(deps): update sqllogictest requirement from 0.23.0 to 0.24.0 [datafusion]

2024-12-24 Thread via GitHub
xudong963 opened a new pull request, #13902: URL: https://github.com/apache/datafusion/pull/13902 ## Which issue does this PR close? Closes #https://github.com/apache/datafusion/pull/13884 ## Rationale for this change ## What changes are included in this P

Re: [PR] Adds support for pg DROP EXTENSION [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
ramnivas commented on code in PR #1610: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1610#discussion_r1896909214 ## src/parser/mod.rs: ## @@ -5861,6 +5863,22 @@ impl<'a> Parser<'a> { }) } +pub fn parse_drop_extension(&mut self) -> Result { +

Re: [PR] Adds support for pg DROP EXTENSION [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
ramnivas commented on code in PR #1610: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1610#discussion_r1896909521 ## tests/sqlparser_postgres.rs: ## @@ -662,6 +662,22 @@ fn parse_create_extension() { .verified_stmt("CREATE EXTENSION extension_name WITH SC

Re: [PR] Adds support for pg DROP EXTENSION [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
ramnivas commented on code in PR #1610: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1610#discussion_r1896909161 ## src/ast/mod.rs: ## @@ -2759,6 +2759,15 @@ pub enum Statement { version: Option, }, /// ```sql +/// DROP EXTENSION [ IF EXIST

[PR] Refactor advancing token to avoid duplication, avoid borrow checker issues [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb opened a new pull request, #1618: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1618 - Follow on to https://github.com/apache/datafusion-sqlparser-rs/pull/1587 - Part of https://github.com/apache/datafusion-sqlparser-rs/issues/1558 @davisp made some great improve

Re: [PR] Refactor advancing token to avoid duplication, avoid borrow checker issues [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb commented on code in PR #1618: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1618#discussion_r1896991320 ## src/parser/mod.rs: ## @@ -3641,22 +3670,31 @@ impl<'a> Parser<'a> { #[must_use] pub fn parse_keyword(&mut self, expected: Keyword) -> bool {

Re: [PR] Improve parsing performance by reducing token cloning [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb commented on PR #1587: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1587#issuecomment-2561484355 Some follow on PRs: - https://github.com/apache/datafusion-sqlparser-rs/pull/1617 - https://github.com/apache/datafusion-sqlparser-rs/pull/1618

Re: [I] Consolidate Example: simplify_udaf_expression.rs into advanced_udaf.rs [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13842: URL: https://github.com/apache/datafusion/issues/13842#issuecomment-2561469759 > @alamb > > Sorry for the delay 🙏 But I have a quick question! I’m not too sure about this, but should the return type of `AggregateFunctionSimplification` perhaps be `Resu

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2561469424 FWIW I don't think the size of the datafusion-cli binary is all that critical per se (maybe we can adjust / optimize the size of what is distributed on homebrew) What I was

Re: [PR] Fix visibility of `swap_hash_join` to be `pub` [datafusion]

2024-12-24 Thread via GitHub
andygrove merged PR #13899: URL: https://github.com/apache/datafusion/pull/13899 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@data

Re: [I] `swap_hash_join` is no longer public so comet doesn't compile [datafusion]

2024-12-24 Thread via GitHub
andygrove closed issue #13898: `swap_hash_join` is no longer public so comet doesn't compile URL: https://github.com/apache/datafusion/issues/13898 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] Improve parsing performance by reducing token cloning [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb commented on PR #1587: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1587#issuecomment-2561473115 Thanks again @davisp @Dandandan and @iffyio -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

Re: [PR] Fix visibility of `swap_hash_join` to be `pub` [datafusion]

2024-12-24 Thread via GitHub
alamb commented on PR #13899: URL: https://github.com/apache/datafusion/pull/13899#issuecomment-2561472992 Thanks @andygrove 🙏 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] Improve parsing performance by reducing token cloning [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb merged PR #1587: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1587 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr.

[PR] Improve Parser documentation [datafusion-sqlparser-rs]

2024-12-24 Thread via GitHub
alamb opened a new pull request, #1617: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1617 - While reviewing https://github.com/apache/datafusion-sqlparser-rs/pull/1587 and preparing to improve documentation more it occurs to me the `Parser` struct could be documented better

[I] [substrait] customizable producer [datafusion]

2024-12-24 Thread via GitHub
vbarua opened a new issue, #13901: URL: https://github.com/apache/datafusion/issues/13901 ### Is your feature request related to a problem or challenge? The work in https://github.com/apache/datafusion/pull/13803 as part of https://github.com/apache/datafusion/issues/13318 introduced

Re: [I] [substrait] customizable producer [datafusion]

2024-12-24 Thread via GitHub
vbarua commented on issue #13901: URL: https://github.com/apache/datafusion/issues/13901#issuecomment-2561475829 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [PR] [substrait] Add support for ExtensionTable [datafusion]

2024-12-24 Thread via GitHub
vbarua commented on code in PR #13772: URL: https://github.com/apache/datafusion/pull/13772#discussion_r1896987187 ## datafusion/substrait/src/logical_plan/consumer.rs: ## @@ -438,6 +439,22 @@ pub trait SubstraitConsumer: Send + Sync + Sized { user_defined_literal.t

Re: [PR] [substrait] Add support for ExtensionTable [datafusion]

2024-12-24 Thread via GitHub
vbarua commented on PR #13772: URL: https://github.com/apache/datafusion/pull/13772#issuecomment-2561479915 Left some feedback about the `consume_extension_table` API. > However, there are a few things that I'd like to revisit once we have a (symmetrical) customisable SubstraitProduce

[I] `swap_hash_join` is no longer public so comet doesn't compile [datafusion]

2024-12-24 Thread via GitHub
alamb opened a new issue, #13898: URL: https://github.com/apache/datafusion/issues/13898 ### Describe the bug - part of #13835 When upgrading comet to use the latest datafusion pin (https://github.com/andygrove/datafusion-comet/pull/1) it fails to compile with the foll

Re: [I] Test DataFusion 44.0.0 with Comet [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13835: URL: https://github.com/apache/datafusion/issues/13835#issuecomment-2561457599 I found another issue when testing today: - https://github.com/apache/datafusion/issues/13898 -- This is an automated message from the Apache Git Service. To respond to the me

Re: [PR] Minor: change visibility of hash join utils [datafusion]

2024-12-24 Thread via GitHub
alamb commented on code in PR #13893: URL: https://github.com/apache/datafusion/pull/13893#discussion_r1896966577 ## datafusion/core/src/physical_optimizer/join_selection.rs: ## @@ -176,7 +176,7 @@ fn swap_join_projection( /// This function swaps the inputs of the given join op

[PR] Fix visibility of `swap_hash_join` [datafusion]

2024-12-24 Thread via GitHub
alamb opened a new pull request, #13899: URL: https://github.com/apache/datafusion/pull/13899 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/13898 - Fix a regression in visibility introduced in https://github.com/apache/datafusion/pull/13893

Re: [I] Release DataFusion `44.0.0` [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13334: URL: https://github.com/apache/datafusion/issues/13334#issuecomment-2561458685 Update, when preparing to make a final RC I found another issue: - https://github.com/apache/datafusion/issues/13898 I have a proposed fix here: - https://github.com/ap
