Re: [PR] Fix: limit is missing after removing SPM [datafusion]

2025-02-10 Thread via GitHub
zhuqi-lucas commented on code in PR #14569: URL: https://github.com/apache/datafusion/pull/14569#discussion_r1948599651 ## datafusion/physical-optimizer/src/enforce_sorting/mod.rs: ## @@ -373,9 +373,10 @@ pub fn ensure_sorting( return adjust_window_sort_removal(requirem

Re: [PR] feat: add hint for missing fields [datafusion]

2025-02-10 Thread via GitHub
eliaperantoni commented on code in PR #14521: URL: https://github.com/apache/datafusion/pull/14521#discussion_r1948557193 ## datafusion/sql/tests/cases/diagnostic.rs: ## @@ -201,14 +201,8 @@ fn test_ambiguous_reference() -> Result<()> { let diag = do_query(query); asse

Re: [I] Attach `Diagnostic` to "incompatible type in unary expression" error [datafusion]

2025-02-10 Thread via GitHub
eliaperantoni commented on issue #14433: URL: https://github.com/apache/datafusion/issues/14433#issuecomment-2647246100 Hey @alan910127, sorry for the delay, thank you so much for taking on this ticket! I suggest you look at `datafusion/sql/tests/cases/diagnostic.rs` and create a tes

Re: [I] Update ClickBench benchmarks with DataFusion `45.0.0` (When Published) [datafusion]

2025-02-10 Thread via GitHub
pmcgleenon commented on issue #14246: URL: https://github.com/apache/datafusion/issues/14246#issuecomment-2647256168 FYI the [ClickBench PR](https://github.com/ClickHouse/ClickBench/pull/304) has been merged and the latest DataFusion `45.0.0` results have been published on the site https://

[PR] chore(deps): bump strum from 0.26.3 to 0.27.0 [datafusion]

2025-02-10 Thread via GitHub
dependabot[bot] opened a new pull request, #14573: URL: https://github.com/apache/datafusion/pull/14573 Bumps [strum](https://github.com/Peternator7/strum) from 0.26.3 to 0.27.0. Release notes Sourced from [strum's releases](https://github.com/Peternator7/strum/releases). v0.

Re: [PR] Fix: limit is missing after removing SPM [datafusion]

2025-02-10 Thread via GitHub
zhuqi-lucas commented on code in PR #14569: URL: https://github.com/apache/datafusion/pull/14569#discussion_r1948589383 ## datafusion/core/tests/physical_optimizer/enforce_sorting.rs: ## @@ -1943,6 +1943,30 @@ async fn test_remove_unnecessary_spm1() -> Result<()> { Ok(())

Re: [PR] Drop RowConverter from GroupOrderingPartial [datafusion]

2025-02-10 Thread via GitHub
ctsk commented on PR #14566: URL: https://github.com/apache/datafusion/pull/14566#issuecomment-2647275498 ## Micro-benchmark results - `group_ordering_$n` benchmarks the partial group ordering where the ordering contains $n columns. Each column consists of 8192 Int32s. ```

[PR] bigquery supports group by cube/rollup etc. [datafusion-sqlparser-rs]

2025-02-10 Thread via GitHub
Groennbeck opened a new pull request, #1720: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1720 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

[I] CometHashJoin always selects BuildRight which causes potential performance regression [datafusion-comet]

2025-02-10 Thread via GitHub
hayman42 opened a new issue, #1382: URL: https://github.com/apache/datafusion-comet/issues/1382 ### Describe the bug First of all, thank you guys for such a great project. I am currently doing some research to see if our team can make use of DataFusion Comet for our workload. A

[PR] Add SBT support for project [datafusion-comet]

2025-02-10 Thread via GitHub
EmilyMatt opened a new pull request, #1383: URL: https://github.com/apache/datafusion-comet/pull/1383 ## Which issue does this PR close? Closes #1344 ## Rationale for this change Should optimize incremental compilation and allow for much faster development and tests cyc

Re: [PR] minor: Move file compression to `datafusion-catalog-listing` [datafusion]

2025-02-10 Thread via GitHub
logan-keede commented on PR #14555: URL: https://github.com/apache/datafusion/pull/14555#issuecomment-2647881291 Thanks for the review, @alamb

Re: [PR] chore(deps): bump strum from 0.26.3 to 0.27.0 [datafusion]

2025-02-10 Thread via GitHub
alamb merged PR #14573: URL: https://github.com/apache/datafusion/pull/14573

Re: [PR] Minor: remove unnecessary dependencies in `datafusion-sqllogictest` [datafusion]

2025-02-10 Thread via GitHub
xudong963 commented on PR #14578: URL: https://github.com/apache/datafusion/pull/14578#issuecomment-2647931037 Thanks @alamb

Re: [PR] Minor: remove unnecessary dependencies in `datafusion-sqllogictest` [datafusion]

2025-02-10 Thread via GitHub
xudong963 merged PR #14578: URL: https://github.com/apache/datafusion/pull/14578

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14544: URL: https://github.com/apache/datafusion/pull/14544#issuecomment-2647942928 Ah, I see the snmalloc comes from running ``` docs/source/user-guide/crate-configuration.md ``` I think the issue there is that snmalloc takes over the global alloca

Re: [I] [DISCUSSION] 2025 Q1-Q2 Roadmap [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14580: URL: https://github.com/apache/datafusion/issues/14580#issuecomment-2647987276 I personally plan to focus on - 🚀 performance: Help complete the advanced parquet predicate pushdown with @XiangpengHao https://github.com/apache/datafusion/issues/3463 - 🔨

Re: [PR] equivalence classes: use normalized mapping for projection [datafusion]

2025-02-10 Thread via GitHub
askalt commented on PR #14327: URL: https://github.com/apache/datafusion/pull/14327#issuecomment-2647985607 @berkaysynnada Could you check please?

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14544: URL: https://github.com/apache/datafusion/pull/14544#issuecomment-2648007180 I plan to merge this PR in once the tests pass

Re: [PR] feat: add hint for missing fields [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14521: URL: https://github.com/apache/datafusion/pull/14521#issuecomment-2648008182 Thanks everyone! This is epic

Re: [PR] feat: add hint for missing fields [datafusion]

2025-02-10 Thread via GitHub
alamb merged PR #14521: URL: https://github.com/apache/datafusion/pull/14521

Re: [PR] Fix Float and Decimal coercion [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14273: URL: https://github.com/apache/datafusion/pull/14273#issuecomment-2648142854 Marking as a draft as I don't think this is waiting for review anymore (nor have we figured out a consensus either)

Re: [PR] Fix Float and Decimal coercion [datafusion]

2025-02-10 Thread via GitHub
andygrove commented on code in PR #14273: URL: https://github.com/apache/datafusion/pull/14273#discussion_r1949145745 ## datafusion/sqllogictest/test_files/tpch/plans/q6.slt.part: ## @@ -31,13 +31,13 @@ logical_plan 01)Projection: sum(lineitem.l_extendedprice * lineitem.l_disco

Re: [PR] Fix: limit is missing after removing SPM [datafusion]

2025-02-10 Thread via GitHub
xudong963 commented on PR #14569: URL: https://github.com/apache/datafusion/pull/14569#issuecomment-2648148207 > This makes sense to me -- thank you @xudong963 > > In general it seems like we have a class of bugs related to removing `fetch` -- maybe we should re-evaluate `enforce_sort

Re: [PR] Fix Float and Decimal coercion [datafusion]

2025-02-10 Thread via GitHub
findepi commented on code in PR #14273: URL: https://github.com/apache/datafusion/pull/14273#discussion_r1949089345 ## datafusion/sqllogictest/test_files/tpch/plans/q6.slt.part: ## @@ -31,13 +31,13 @@ logical_plan 01)Projection: sum(lineitem.l_extendedprice * lineitem.l_discoun

Re: [I] Official Docker Image is not found [datafusion-ballista]

2025-02-10 Thread via GitHub
milenkovicm commented on issue #1178: URL: https://github.com/apache/datafusion-ballista/issues/1178#issuecomment-2648167946 I believe this is a duplicate of #1044

Re: [I] Create UNION plan node with correct schema [datafusion]

2025-02-10 Thread via GitHub
jonahgao commented on issue #14380: URL: https://github.com/apache/datafusion/issues/14380#issuecomment-2648181791 > Should we move `TypeCoercion` into builder 🤔 ? I also thought about that, but some users do not want the casts introduced by `TypeCoercion`. See https://github.co

Re: [PR] Fix: limit is missing after removing SPM [datafusion]

2025-02-10 Thread via GitHub
xudong963 merged PR #14569: URL: https://github.com/apache/datafusion/pull/14569

Re: [PR] Fix: limit is missing after removing SPM [datafusion]

2025-02-10 Thread via GitHub
xudong963 commented on PR #14569: URL: https://github.com/apache/datafusion/pull/14569#issuecomment-2648194193 Thanks all, let's go!

Re: [I] Current Ballista release is broken? [datafusion-ballista]

2025-02-10 Thread via GitHub
milenkovicm commented on issue #1179: URL: https://github.com/apache/datafusion-ballista/issues/1179#issuecomment-2648199437 `./dev/build-ballista-docker.sh` works with latest master, please use latest master, we can't really provide support for older branches

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14123: URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2648317619 > [@alamb](https://github.com/alamb) I'll also do some updates in the issue summary. > > Considering that this is the first time I've been involved in this process, could y

Re: [I] Hide boilerplate in documentation examples [datafusion]

2025-02-10 Thread via GitHub
alamb closed issue #14557: Hide boilerplate in documentation examples URL: https://github.com/apache/datafusion/issues/14557

Re: [I] ListingTable cannot handle partition evolution [datafusion]

2025-02-10 Thread via GitHub
logan-keede commented on issue #13270: URL: https://github.com/apache/datafusion/issues/13270#issuecomment-2648393486 @adriangb my focus has been on refactoring `FileScanConfig` to move it out of core. I can't say I understand the internals that much, but I will look into it and mention it h

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14544: URL: https://github.com/apache/datafusion/pull/14544#issuecomment-2648403884 🚀 Thanks again @ugoa -- this is great work FYI @tshauck who I think started this documentation many months ago

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb merged PR #14544: URL: https://github.com/apache/datafusion/pull/14544

Re: [I] Run / Test all examples in Documentation [datafusion]

2025-02-10 Thread via GitHub
alamb closed issue #14435: Run / Test all examples in Documentation URL: https://github.com/apache/datafusion/issues/14435

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2025-02-10 Thread via GitHub
findepi closed issue #11513: [Proposal] Decouple logical from physical types URL: https://github.com/apache/datafusion/issues/11513

Re: [I] [Proposal] Decouple logical from physical types [datafusion]

2025-02-10 Thread via GitHub
findepi commented on issue #11513: URL: https://github.com/apache/datafusion/issues/11513#issuecomment-2648438472 Discussion continues under https://github.com/apache/datafusion/issues/12622. Let me close this one. I don't think there is a benefit of keeping two issues open for single topic

[PR] fix: case-sensitive quoted identifiers in DELETE statements [datafusion]

2025-02-10 Thread via GitHub
nantunes opened a new pull request, #14584: URL: https://github.com/apache/datafusion/pull/14584 ## Which issue does this PR close? - Closes #14583. ## Rationale for this change When executing DELETE statements with case-sensitive quoted table names, the table name was i

[PR] Move FileSinkConfig out of Core [datafusion]

2025-02-10 Thread via GitHub
logan-keede opened a new pull request, #14585: URL: https://github.com/apache/datafusion/pull/14585 ## Which issue does this PR close? - Part of #1. ## Rationale for this change ## What changes are included in this PR? Refactor of `FileSinkConfig` and

Re: [PR] Add xxhash algorithms in SQL and expression api [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14367: URL: https://github.com/apache/datafusion/pull/14367#issuecomment-2648334436 > @alamb could you also get a clarification over the inclusion of wyhash functions as well? It was also requested under the same issue [here](https://github.com/apache/datafusion/issue

Re: [PR] Add xxhash algorithms in SQL and expression api [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14367: URL: https://github.com/apache/datafusion/pull/14367#issuecomment-2648336196 So unless there are many users (and ideally maintainers) of new functions, I am hesitant to add new ones

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14123: URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2648320294 @xudong963 when would you like to start making the release? Maybe we should target the week of Feb 24 🤔

Re: [PR] Use ` take_function_args` in more places [datafusion]

2025-02-10 Thread via GitHub
lgingerich commented on PR #14525: URL: https://github.com/apache/datafusion/pull/14525#issuecomment-2648325869 @alamb Should be fixed now. I just didn't have the language of the test properly matching this new function.

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
ugoa commented on PR #14544: URL: https://github.com/apache/datafusion/pull/14544#issuecomment-2648417980 My pleasure! I learned a lot during the time as well.

[I] DELETE statement fails to preserve quoted identifiers [datafusion]

2025-02-10 Thread via GitHub
nantunes opened a new issue, #14583: URL: https://github.com/apache/datafusion/issues/14583 ### Describe the bug When executing a DELETE statement with a case-sensitive quoted table name, the table name is incorrectly normalized to lowercase, causing "table not found" errors.

Re: [I] Extended tests are (still) failing on main [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14576: URL: https://github.com/apache/datafusion/issues/14576#issuecomment-2647795542 It would be amazing to be able to run these tests prior to merge to main, see - https://github.com/apache/datafusion/issues/14319 That might be what I try and help along f

Re: [I] Document PREPARE statements [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #13570: URL: https://github.com/apache/datafusion/issues/13570#issuecomment-2647797989 Thanks @dhegberg -- I am not sure if that is desired or not -- maybe it is worth a ticket

[PR] initial commit for scalar udf in ffi crate [datafusion]

2025-02-10 Thread via GitHub
timsaucer opened a new pull request, #14579: URL: https://github.com/apache/datafusion/pull/14579 ## Which issue does this PR close? Addresses part of https://github.com/apache/datafusion/issues/14562 - specifically scalar udfs ## Rationale for this change This is a pure

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb commented on code in PR #14544: URL: https://github.com/apache/datafusion/pull/14544#discussion_r1949026721 ## docs/rustdoc_trim.py: ## @@ -0,0 +1,72 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE

Re: [I] Extended tests are failing on main [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14549: URL: https://github.com/apache/datafusion/issues/14549#issuecomment-2647794572 > Extended tests are still failing Thanks. Yeah it seems they worked for one commit ![Image](https://github.com/user-attachments/assets/a7159d97-6188-4b05-94ea-db2768e713

[I] Extended tests are (still) failing on main [datafusion]

2025-02-10 Thread via GitHub
alamb opened a new issue, #14576: URL: https://github.com/apache/datafusion/issues/14576 ### Describe the bug - Follow on to https://github.com/apache/datafusion/issues/14549 The extended tests are failing again Here is an example: https://github.com/apache/datafusion/ac

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
ugoa closed pull request #14544: Test all examples from library-user-guide & user-guide docs URL: https://github.com/apache/datafusion/pull/14544

Re: [PR] minor: Move file compression [datafusion]

2025-02-10 Thread via GitHub
alamb commented on code in PR #14555: URL: https://github.com/apache/datafusion/pull/14555#discussion_r1948945841 ## datafusion/core/Cargo.toml: ## @@ -43,7 +43,7 @@ array_expressions = ["nested_expressions"] # Used to enable the avro format avro = ["apache-avro", "num-traits"

Re: [PR] minor: Move file compression to `datafusion-catalog-listing` [datafusion]

2025-02-10 Thread via GitHub
alamb merged PR #14555: URL: https://github.com/apache/datafusion/pull/14555

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14544: URL: https://github.com/apache/datafusion/pull/14544#issuecomment-2647984070 Making the examples in the doc rendered with `sql` rather than ``` was a great idea. However, since those files are automatically generated from the source code we need to update the ge

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14544: URL: https://github.com/apache/datafusion/pull/14544#issuecomment-2648205592 So close...

[PR] Add guideline for GSoC 2025 applicants under Contributor Guide [datafusion]

2025-02-10 Thread via GitHub
oznur-synnada opened a new pull request, #14582: URL: https://github.com/apache/datafusion/pull/14582 ## Which issue does this PR close? ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Implement predicate pruning for not like expressions [datafusion]

2025-02-10 Thread via GitHub
adriangb commented on code in PR #14567: URL: https://github.com/apache/datafusion/pull/14567#discussion_r1949194705 ## datafusion/physical-optimizer/src/pruning.rs: ## @@ -1710,6 +1717,56 @@ fn build_like_match( Some(combined) } +// For predicate `col NOT LIKE 'foo%'`,

Re: [I] [DISCUSSION] Add separate crate to cover spark builtin functions [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2648221037 > Since we're planning of having a separate mode for spark wherein a user can access all spark functions and also not make the main code dependent on this crate, I was thinking if t

Re: [PR] Use ` take_function_args` in more places [datafusion]

2025-02-10 Thread via GitHub
alamb commented on code in PR #14525: URL: https://github.com/apache/datafusion/pull/14525#discussion_r1949200715 ## datafusion/expr/src/test/function_stub.rs: ## @@ -125,9 +125,7 @@ impl AggregateUDFImpl for Sum { } fn coerce_types(&self, arg_types: &[DataType]) ->

Re: [I] ListingTable cannot handle partition evolution [datafusion]

2025-02-10 Thread via GitHub
adriangb commented on issue #13270: URL: https://github.com/apache/datafusion/issues/13270#issuecomment-2648246475 @logan-keede I see you're doing some work on `FileScanConfig`. Would it be relevant to consider what needs to be changed to fix this?

Re: [PR] Use ` take_function_args` in more places [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14525: URL: https://github.com/apache/datafusion/pull/14525#issuecomment-2648250092 Thank you so much @lgingerich I started the CI checks FYI When I ran some of the tests locally I saw some failures which seem to be related: ``` cargo test

Re: [PR] Use ` take_function_args` in more places [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14525: URL: https://github.com/apache/datafusion/pull/14525#issuecomment-2648247269 FYI @findepi

Re: [PR] Add xxhash algorithms in SQL and expression api [datafusion]

2025-02-10 Thread via GitHub
Spaarsh commented on PR #14367: URL: https://github.com/apache/datafusion/pull/14367#issuecomment-2648257379 @alamb could you also get a clarification over the inclusion of wyhash functions as well? It was also requested under the same issue [here](https://github.com/apache/datafusion/issue

Re: [I] [DISCUSSION] Add separate crate to cover spark builtin functions [datafusion]

2025-02-10 Thread via GitHub
Spaarsh commented on issue #5600: URL: https://github.com/apache/datafusion/issues/5600#issuecomment-2648452637 > I think that would be ok (maybe implement this in datafusion-cli) I'm sorry if I've got this wrong, you're suggesting that we could make an import command in the datafusion

Re: [PR] refactor: Move FileSinkConfig out of Core [datafusion]

2025-02-10 Thread via GitHub
logan-keede commented on PR #14585: URL: https://github.com/apache/datafusion/pull/14585#issuecomment-2648480794 cc @alamb

Re: [PR] Use ` take_function_args` in more places [datafusion]

2025-02-10 Thread via GitHub
lgingerich commented on PR #14525: URL: https://github.com/apache/datafusion/pull/14525#issuecomment-2648489071 I'll check out these new errors later today. Is there a simple way to run all these tests locally without manually running each of the different cargo test combinations? (I

Re: [PR] Use ` take_function_args` in more places [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14525: URL: https://github.com/apache/datafusion/pull/14525#issuecomment-2648504081 > I'll check out these new errors later today. > > Is there a simple way to run all these tests locally without manually running each of the different cargo test combinations? (

[PR] Add support for PostgreSQL and Redshift geometric operators [datafusion-sqlparser-rs]

2025-02-10 Thread via GitHub
benrsatori opened a new pull request, #1721: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1721 Add support for PostgreSQL and Redshift geometric operators, enabling calculations and comparisons for geometric data types like points, lines, and polygons.

[I] [EPIC] ClickBench Improvements (Vanity Benchmark) [datafusion]

2025-02-10 Thread via GitHub
alamb opened a new issue, #14586: URL: https://github.com/apache/datafusion/issues/14586 ### Is your feature request related to a problem or challenge? The [ClickBench Benchmark](https://benchmark.clickhouse.com/) measures the performance of filtering and aggregation Being on t

[I] Custom CLI Mode with Manual ```import``` for Functions [datafusion]

2025-02-10 Thread via GitHub
Spaarsh opened a new issue, #14588: URL: https://github.com/apache/datafusion/issues/14588 ### Is your feature request related to a problem or challenge? As we increase the number of functions in our core, it might lead to an increased runtime footprint for datafusion-cli in the futur

Re: [I] Custom CLI Mode with Manual ```import``` for Functions [datafusion]

2025-02-10 Thread via GitHub
Spaarsh commented on issue #14588: URL: https://github.com/apache/datafusion/issues/14588#issuecomment-2648658163 I am willing to work on this issue once it is validated by the community.

Re: [I] Question: `to_char(date, timstamp format)` [datafusion]

2025-02-10 Thread via GitHub
Omega359 commented on issue #14536: URL: https://github.com/apache/datafusion/issues/14536#issuecomment-2648671927 > I had the impression (although perhaps it is dated) that datafusion sought to be compatible with postgres to the extent reasonable. Assuming that's still the case is there a r

Re: [I] Update ClickBench benchmarks with DataFusion `45.0.0` (When Published) [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14246: URL: https://github.com/apache/datafusion/issues/14246#issuecomment-2648548383 Filed a ticket for running this on 46 - https://github.com/apache/datafusion/issues/14587

Re: [I] [EPIC] ClickBench Improvements (Vanity Benchmark) [datafusion]

2025-02-10 Thread via GitHub
Rachelint commented on issue #14586: URL: https://github.com/apache/datafusion/issues/14586#issuecomment-2648625729 > > I am trying a PoC of supporting the block approach by only modifying the code of group values (we also need to modify the code of GroupAccumulator too in [#11943](https://github.co

Re: [PR] add manual trigger for extended tests in pull requests [datafusion]

2025-02-10 Thread via GitHub
alamb commented on PR #14331: URL: https://github.com/apache/datafusion/pull/14331#issuecomment-2648628261 I merged this test up from main

Re: [I] Document PREPARE statements [datafusion]

2025-02-10 Thread via GitHub
dhegberg commented on issue #13570: URL: https://github.com/apache/datafusion/issues/13570#issuecomment-2648541644 I think I'll post my documentation change without the named arguments. I'll then take a stab at adding support for named arguments with EXECUTE.

Re: [I] Update ClickBench benchmarks with DataFusion `45.0.0` (When Published) [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14246: URL: https://github.com/apache/datafusion/issues/14246#issuecomment-2648540431 > Nice! Looks we have some more competition now from DuckDB:... - Filed https://github.com/apache/datafusion/issues/14586 to start up the optimization machine

Re: [PR] Add support for PostgreSQL and Redshift geometric operators [datafusion-sqlparser-rs]

2025-02-10 Thread via GitHub
benrsatori closed pull request #1721: Add support for PostgreSQL and Redshift geometric operators URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1721

Re: [I] [EPIC] ClickBench Improvements (Vanity Benchmark) [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14586: URL: https://github.com/apache/datafusion/issues/14586#issuecomment-2648610440 > I am trying a PoC of supporting the block approach by only modifying the code of group values (we also need to modify the code of GroupAccumulator too in https://github.com/apache/datafu

Re: [I] [EPIC] ClickBench Improvements (Vanity Benchmark) [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14586: URL: https://github.com/apache/datafusion/issues/14586#issuecomment-2648588773 I took a brief look at [some results](https://benchmark.clickhouse.com/#eyJzeXN0ZW0iOnsiQWxsb3lEQiI6ZmFsc2UsIkFsbG95REIgKHR1bmVkKSI6ZmFsc2UsIkF0aGVuYSAocGFydGl0aW9uZWQpIjpmYWxzZSwi

Re: [I] [EPIC] ClickBench Improvements (Vanity Benchmark) [datafusion]

2025-02-10 Thread via GitHub
Rachelint commented on issue #14586: URL: https://github.com/apache/datafusion/issues/14586#issuecomment-2648602542 A low-hanging fruit is #13617; I plan to finish it this week. And maybe it is time to push #11943 forward... I am trying a PoC that supports the `block approach` by `

Re: [I] Update ClickBench benchmarks with DataFusion `44.0.0` [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #13983: URL: https://github.com/apache/datafusion/issues/13983#issuecomment-2648602300 Discussion about making more improvements: - https://the-asf.slack.com/archives/C04RJ0C85UZ/p1739204225620989 -- This is an automated message from the Apache Git Service. To r

[I] Update ClickBench benchmarks with DataFusion `46.0.0` (When Published) [datafusion]

2025-02-10 Thread via GitHub
alamb opened a new issue, #14587: URL: https://github.com/apache/datafusion/issues/14587 ### Is your feature request related to a problem or challenge? - Follow on to https://github.com/apache/datafusion/issues/13983

Re: [I] Document PREPARE statements [datafusion]

2025-02-10 Thread via GitHub
dhegberg commented on issue #13570: URL: https://github.com/apache/datafusion/issues/13570#issuecomment-2648544296 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [I] [EPIC] ClickBench Improvements (Vanity Benchmark) [datafusion]

2025-02-10 Thread via GitHub
Rachelint commented on issue #14586: URL: https://github.com/apache/datafusion/issues/14586#issuecomment-2648640934 On the optimizer side, I wonder whether `single_distinct_to_groupby` really improves performance in the current version? -- This is an automated message from the Apache Git Service.

Re: [I] Extended tests are (still) failing on main [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14576: URL: https://github.com/apache/datafusion/issues/14576#issuecomment-2648636639 I noticed that the github runner generated several warnings about diskspace https://github.com/user-attachments/assets/918fb3a4-1149-41b7-9027-e62721ac8800 --

Re: [PR] Fix Float and Decimal coercion [datafusion]

2025-02-10 Thread via GitHub
andygrove commented on code in PR #14273: URL: https://github.com/apache/datafusion/pull/14273#discussion_r1949552271 ## datafusion/sqllogictest/test_files/tpch/plans/q6.slt.part: ## @@ -31,13 +31,13 @@ logical_plan 01)Projection: sum(lineitem.l_extendedprice * lineitem.l_disco

Re: [PR] fix: disable checking for uint_8 and uint_16 if complex type readers are enabled [datafusion-comet]

2025-02-10 Thread via GitHub
parthchandra commented on PR #1376: URL: https://github.com/apache/datafusion-comet/pull/1376#issuecomment-2648702671 @andygrove updated this to fall back, updated the unit tests, and removed the draft tag -- This is an automated message from the Apache Git Service. To respond to the mess

[PR] minor: check size overflow before string repeat build [datafusion]

2025-02-10 Thread via GitHub
wForget opened a new pull request, #14575: URL: https://github.com/apache/datafusion/pull/14575 ## Which issue does this PR close? minor fix ## Rationale for this change Check string size overflow before string repeat build to fail fast and save memory. ## What ch
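The fail-fast idea in this PR description can be illustrated with a minimal sketch. This is not the code from the PR: `checked_repeat` and its `max_bytes` parameter are hypothetical names chosen for illustration, and only standard-library calls are used. It computes the output size with overflow checking before allocating anything.

```rust
/// Hypothetical helper (not DataFusion's actual implementation):
/// compute the output size first, and reject the call before any
/// allocation if the size overflows `usize` or exceeds `max_bytes`.
fn checked_repeat(s: &str, n: usize, max_bytes: usize) -> Result<String, String> {
    // `checked_mul` returns `None` on overflow instead of wrapping or panicking.
    let total = s
        .len()
        .checked_mul(n)
        .ok_or_else(|| format!("repeat size overflows usize: {} * {}", s.len(), n))?;

    // Fail fast when the result would exceed the caller's limit.
    if total > max_bytes {
        return Err(format!(
            "repeat result of {} bytes exceeds limit of {} bytes",
            total, max_bytes
        ));
    }

    // Only now build the string; the size is known to be acceptable.
    Ok(s.repeat(n))
}
```

Checking the size up front means an oversized request is rejected with an error before any memory is reserved for the result.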

Re: [I] Update ClickBench benchmarks with DataFusion `45.0.0` (When Published) [datafusion]

2025-02-10 Thread via GitHub
alamb closed issue #14246: Update ClickBench benchmarks with DataFusion `45.0.0` (When Published) URL: https://github.com/apache/datafusion/issues/14246 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] Update ClickBench benchmarks with DataFusion `45.0.0` (When Published) [datafusion]

2025-02-10 Thread via GitHub
alamb commented on issue #14246: URL: https://github.com/apache/datafusion/issues/14246#issuecomment-2647733813 > FYI the [Clickbench PR](https://github.com/ClickHouse/ClickBench/pull/304) has been merged and the latest Datafusion `45.0.0` results have been published on the site https://ben

Re: [PR] feat: add expression array_size [datafusion-comet]

2025-02-10 Thread via GitHub
Groennbeck commented on PR #1122: URL: https://github.com/apache/datafusion-comet/pull/1122#issuecomment-2647707391 > Hi @Groennbeck are you still planning on updating this PR? Hey! Sorry, I have been busy. I can have a look this week. -- This is an automated message from the Apache Git

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
ugoa commented on PR #14544: URL: https://github.com/apache/datafusion/pull/14544#issuecomment-2647808359 The `cmake` needs to be installed in setup-builder because it is required by `snmalloc-rs = "0.3"` -- This is an automated message from the Apache Git Service. To respond to the messa

[I] Apache DataFusion Google Summer of Code (GSoC) Application Guidelines [datafusion]

2025-02-10 Thread via GitHub
ozankabak opened a new issue, #14577: URL: https://github.com/apache/datafusion/issues/14577 ## Introduction Welcome to the Apache DataFusion Google Summer of Code (GSoC) application guidelines. We are excited to support contributors who are passionate about open-source data processi

Re: [PR] feat: [wip] experimental fuzz testing in test suite [datafusion-comet]

2025-02-10 Thread via GitHub
parthchandra commented on code in PR #1374: URL: https://github.com/apache/datafusion-comet/pull/1374#discussion_r1948989419 ## spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala: ## @@ -116,12 +116,49 @@ abstract class CometTestBase require(absTol > 0 && absTol

Re: [PR] Test all examples from library-user-guide & user-guide docs [datafusion]

2025-02-10 Thread via GitHub
alamb commented on code in PR #14544: URL: https://github.com/apache/datafusion/pull/14544#discussion_r1948995913 ## datafusion/core/Cargo.toml: ## @@ -152,6 +154,8 @@ serde_json = { workspace = true } sysinfo = "0.33.1" test-utils = { path = "../../test-utils" } tokio = { wo

Re: [I] Make it easier to use rust DataFusion UDFs in datafusion-python [datafusion-python]

2025-02-10 Thread via GitHub
timsaucer commented on issue #1017: URL: https://github.com/apache/datafusion-python/issues/1017#issuecomment-2647858283 @Spaarsh I've put up a *draft* PR for the scalar udf, but it has a few points that need cleaning up still: https://github.com/apache/datafusion/pull/14579 One thi
