Re: [PR] support simple/cross lateral joins [datafusion]

2025-05-13 Thread via GitHub
jayzhan211 commented on code in PR #16015: URL: https://github.com/apache/datafusion/pull/16015#discussion_r2086582165 ## datafusion/optimizer/src/decorrelate_lateral_join.rs: ## @@ -0,0 +1,142 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

Re: [PR] feat: Improve performance tracing feature [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove commented on code in PR #1730: URL: https://github.com/apache/datafusion-comet/pull/1730#discussion_r2086704226 ## native/core/src/execution/shuffle/shuffle_writer.rs: ## @@ -625,22 +626,24 @@ impl MultiPartitionShuffleRepartitioner { return Ok(());

Re: [PR] feat: Improve performance tracing feature [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove commented on code in PR #1730: URL: https://github.com/apache/datafusion-comet/pull/1730#discussion_r2086717495 ## native/core/src/execution/shuffle/shuffle_writer.rs: ## @@ -650,96 +653,97 @@ impl ShufflePartitioner for MultiPartitionShuffleRepartitioner { /// T

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-05-13 Thread via GitHub
ion-elgreco commented on code in PR #14286: URL: https://github.com/apache/datafusion/pull/14286#discussion_r2086999273 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,238 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

[I] Improve error message format for `TrackConsumersPool` [datafusion]

2025-05-13 Thread via GitHub
2010YOUY01 opened a new issue, #16040: URL: https://github.com/apache/datafusion/issues/16040 ### Is your feature request related to a problem or challenge? When running an external query with `TrackConsumersPool`, when the memory pool runs out of memory and the error is unrecoverable

[PR] docs: Add note on setting `core.abbrev` when generating diffs [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove opened a new pull request, #1735: URL: https://github.com/apache/datafusion-comet/pull/1735 ## Which issue does this PR close? N/A ## Rationale for this change Improve docs. ## What changes are included in this PR? ## How ar

Re: [PR] implement pretty-printing with `{:#}` [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
lovasoa commented on PR #1847: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1847#issuecomment-2876743720 Great, thank you @alamb ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] docs: Add note on setting `core.abbrev` when generating diffs [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove commented on code in PR #1735: URL: https://github.com/apache/datafusion-comet/pull/1735#discussion_r2087004311 ## docs/source/contributor-guide/spark-sql-tests.md: ## @@ -117,12 +118,16 @@ wiggle --replace ./sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.

Re: [PR] docs: Add note on setting `core.abbrev` when generating diffs [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove commented on PR #1735: URL: https://github.com/apache/datafusion-comet/pull/1735#issuecomment-2876810438 @mbutrovich could you review?

[PR] [ignore] Enable DPP Spark SQL tests [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove opened a new pull request, #1734: URL: https://github.com/apache/datafusion-comet/pull/1734 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [I] Improve error message format for `TrackConsumersPool` [datafusion]

2025-05-13 Thread via GitHub
ding-young commented on issue #16040: URL: https://github.com/apache/datafusion/issues/16040#issuecomment-2876828365 take

Re: [I] Improve error message format for `TrackConsumersPool` [datafusion]

2025-05-13 Thread via GitHub
2010YOUY01 commented on issue #16040: URL: https://github.com/apache/datafusion/issues/16040#issuecomment-2876825294 @ding-young could you `take` this task? I am unable to assign for some reason.

Re: [PR] fix: Support Schema Evolution in iceberg [datafusion-comet]

2025-05-13 Thread via GitHub
huaxingao commented on PR #1723: URL: https://github.com/apache/datafusion-comet/pull/1723#issuecomment-2876826410 > The config CometConf.COMET_SCHEMA_EVOLUTION_ENABLED is valid for Parquet files as well so removing it is not correct imo. I have added back `COMET_SCHEMA_EVOLUTION_ENA

Re: [PR] feat: metadata handling for aggregates and window functions [datafusion]

2025-05-13 Thread via GitHub
alamb commented on PR #15911: URL: https://github.com/apache/datafusion/pull/15911#issuecomment-2876838451 🤖 `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.11.0-1013-gcp #13~

Re: [PR] Add late pruning of file based on file level statistics [datafusion]

2025-05-13 Thread via GitHub
adriangb commented on PR #16014: URL: https://github.com/apache/datafusion/pull/16014#issuecomment-2876840440 @alamb I pushed 4607643 which adds some nice APIs for partition values. In particular I think it's important to have a way to prune based on partition values + file level statistics

Re: [PR] fix: stack overflow for substrait functions with large argument lists that translate to DataFusion binary operators [datafusion]

2025-05-13 Thread via GitHub
fmonjalet commented on PR #16031: URL: https://github.com/apache/datafusion/pull/16031#issuecomment-2876849092 @jayzhan211 After looking into it, generating the plan programmatically may become very unwieldy (very verbose). Is your concern about the size of the file checked into git, or the

[I] Use human-readable byte sizes in `explain` [datafusion]

2025-05-13 Thread via GitHub
2010YOUY01 opened a new issue, #16041: URL: https://github.com/apache/datafusion/issues/16041 ### Is your feature request related to a problem or challenge? When running a spilling query, the `explain` will display metrics like `spilled_bytes` ```sh # Run datafusion-cli with

[PR] Docs: Add example of creating a field in `return_field_from_args` [datafusion]

2025-05-13 Thread via GitHub
alamb opened a new pull request, #16039: URL: https://github.com/apache/datafusion/pull/16039 ## Which issue does this PR close? - Follow on to https://github.com/apache/datafusion/pull/15646 ## Rationale for this change I want to make it as easy as possible f

[PR] Add support for INCLUDE/EXCLUDE NULLS for UNPIVOT [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
Vedin opened a new pull request, #1849: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1849 UNPIVOT structure is changed from 2023 and now supports INCLUDE or EXCLUDE NULLS. Full syntax now looks like: ``` SELECT ... FROM ... UNPIVOT [ { INCLUDE | EXCLUDE } NU

Re: [I] Release Comet 0.8.0 [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove closed issue #1635: Release Comet 0.8.0 URL: https://github.com/apache/datafusion-comet/issues/1635

Re: [I] [feature] allow pretty-printing [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
alamb closed issue #1845: [feature] allow pretty-printing URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1845

Re: [I] Release Comet 0.8.0 [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove commented on issue #1635: URL: https://github.com/apache/datafusion-comet/issues/1635#issuecomment-2876553280 Comet 0.8.0 has now been released

Re: [PR] feat: add macros for DataFusionError variants [datafusion]

2025-05-13 Thread via GitHub
Chen-Yuan-Lai commented on code in PR #15946: URL: https://github.com/apache/datafusion/pull/15946#discussion_r2086323971 ## datafusion/common/src/error.rs: ## @@ -655,6 +671,20 @@ impl DataFusionError { queue.push_back(self); ErrorIterator { queue } } + +

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-05-13 Thread via GitHub
alamb commented on code in PR #14286: URL: https://github.com/apache/datafusion/pull/14286#discussion_r2086779525 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,238 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [I] Linear Aggregate Functions Optimization [datafusion]

2025-05-13 Thread via GitHub
Rachelint commented on issue #15633: URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2875454485 > But the inverse optimization is possible It is a really interesting idea to me; I have marked it as possible follow-up work. > There might be cases also whe

Re: [I] Linear Aggregate Functions Optimization [datafusion]

2025-05-13 Thread via GitHub
berkaysynnada commented on issue #15633: URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2875477677 > Sorry, I am still not so clear about these cases ... is it ok to share more details? I mean v2 might be faster in the following case maybe ``` // v1 S

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-13 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2875480275 @kosiew did you update the code or just add a test? I'm getting the same error ``` Error fetching table metadata: Failed to collect data frame results: Shared(ArrowErr

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-05-13 Thread via GitHub
16pierre commented on code in PR #14286: URL: https://github.com/apache/datafusion/pull/14286#discussion_r2086641885 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,238 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [PR] chore(deps): bump sqllogictest from 0.28.1 to 0.28.2 [datafusion]

2025-05-13 Thread via GitHub
xudong963 merged PR #16037: URL: https://github.com/apache/datafusion/pull/16037

Re: [PR] implement `AggregateExec.partition_statistics` [datafusion]

2025-05-13 Thread via GitHub
xudong963 commented on PR #15954: URL: https://github.com/apache/datafusion/pull/15954#issuecomment-2876248152 > @xudong963 Thanks for reviewing. All comments have been addressed, PTAL Thank you, I'll review asap

Re: [PR] refactor: remove deprecated `MemoryExec` [datafusion]

2025-05-13 Thread via GitHub
berkaysynnada merged PR #16007: URL: https://github.com/apache/datafusion/pull/16007

[I] Track peak memory usage in `SortExec` [datafusion]

2025-05-13 Thread via GitHub
2010YOUY01 opened a new issue, #16042: URL: https://github.com/apache/datafusion/issues/16042 ### Is your feature request related to a problem or challenge? Profiling the peak memory usage of blocking operators can be helpful. Aggregation and SortMergeJoin have already implemented it:

Re: [I] Use human-readable byte sizes in `explain` [datafusion]

2025-05-13 Thread via GitHub
ding-young commented on issue #16041: URL: https://github.com/apache/datafusion/issues/16041#issuecomment-2876887755 take

Re: [D] DISCUSSION: May 27, 2025 DataFusion Meetup in Amsterdam [datafusion]

2025-05-13 Thread via GitHub
GitHub user alamb added a comment to the discussion: DISCUSSION: May 27, 2025 DataFusion Meetup in Amsterdam @Dandandan is local, perhaps he knows of some locals who might be interested GitHub link: https://github.com/apache/datafusion/discussions/16038#discussioncomment-13132786 Thi

Re: [PR] implement pretty-printing with `{:#}` [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
lovasoa commented on PR #1847: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1847#issuecomment-2875568296 I'd love if we could merge this one and then I'll follow up with improvements

Re: [I] Linear Aggregate Functions Optimization [datafusion]

2025-05-13 Thread via GitHub
Rachelint commented on issue #15633: URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2875583097 > I mean v2 might be faster in the following case maybe Thanks, I am still not so familiar with `first/last`, but it seems to make sense after studying it a bit.

[PR] chore(deps): bump sqllogictest from 0.28.1 to 0.28.2 [datafusion]

2025-05-13 Thread via GitHub
dependabot[bot] opened a new pull request, #16037: URL: https://github.com/apache/datafusion/pull/16037 Bumps [sqllogictest](https://github.com/risinglightdb/sqllogictest-rs) from 0.28.1 to 0.28.2. Release notes Sourced from https://github.com/risinglightdb/sqllogictest-rs/releases

Re: [PR] implement pretty-printing with `{:#}` [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
alamb commented on PR #1847: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1847#issuecomment-2876519529 I think this is pretty non controversial so let's merge it. @iffyio let us know if you would like any changes

Re: [PR] implement pretty-printing with `{:#}` [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
alamb merged PR #1847: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1847

Re: [PR] Add note to upgrade guide for removal of `ParquetExec`, `AvroExec`, `CsvExec`, `JsonExec` [datafusion]

2025-05-13 Thread via GitHub
berkaysynnada merged PR #16034: URL: https://github.com/apache/datafusion/pull/16034

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-05-13 Thread via GitHub
TheBuilderJR commented on PR #15295: URL: https://github.com/apache/datafusion/pull/15295#issuecomment-2875527091 @kosiew in case it's helpful I ended up building your datafusion fork (on branch schema-adapter) locally so you can get more useful stack traces. See below: ``` Error

Re: [PR] refactor: remove deprecated `ArrowExec` [datafusion]

2025-05-13 Thread via GitHub
berkaysynnada merged PR #16006: URL: https://github.com/apache/datafusion/pull/16006

Re: [I] Linear Aggregate Functions Optimization [datafusion]

2025-05-13 Thread via GitHub
berkaysynnada commented on issue #15633: URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2875378834 ``` // Origin (faster) SELECT aggr(a + b + c) FROM t GROUP BY d; ``` this has a cost of = 2 * C_sum * N + C_agg * N ``` // After converting (slower

Re: [PR] refactor: remove deprecated `JsonExec` [datafusion]

2025-05-13 Thread via GitHub
berkaysynnada merged PR #16005: URL: https://github.com/apache/datafusion/pull/16005

Re: [D] DISCUSSION: January 2025 DataFusion Meetup in Amsterdam / CIDR 2025 [datafusion]

2025-05-13 Thread via GitHub
GitHub user oznur-synnada closed the discussion with a comment: DISCUSSION: January 2025 DataFusion Meetup in Amsterdam / CIDR 2025 > Hi all, thanks for organising the meetup back in January. For next time, my > company Adyen would be happy to host. We have a venue that’s enough for ~50 > an

[I] Add pretty printing to more sql constructs [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
lovasoa opened a new issue, #1850: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1850 This is a followup on https://github.com/apache/datafusion-sqlparser-rs/pull/1847 Here are some constructs that are currently not handled by the pretty printer (they are displayed on

Re: [PR] Add support for `DENY` statements [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
aharpervc commented on PR #1836: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1836#issuecomment-2877018644 This has also been rebased on main & merge conflict resolved

Re: [PR] Add support for table valued functions for SQL Server [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
aharpervc commented on code in PR #1839: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1839#discussion_r2087214184 ## src/ast/mod.rs: ## @@ -8660,6 +8660,28 @@ pub enum CreateFunctionBody { /// /// [PostgreSQL]: https://www.postgresql.org/docs/current/s

Re: [D] Multiple 'group by's, one scan [datafusion]

2025-05-13 Thread via GitHub
GitHub user pepijnve added a comment to the discussion: Multiple 'group by's, one scan In the meantime I've found https://github.com/duckdb/duckdb/discussions/8445 which is similar to what we're trying to accomplish. In this DuckDB proposal the scan sharing is not explicit in the query plan.

Re: [PR] chores: Add lint rule to enforce string formatting style [datafusion]

2025-05-13 Thread via GitHub
comphead commented on PR #16024: URL: https://github.com/apache/datafusion/pull/16024#issuecomment-2877329019 Thanks everyone

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-13 Thread via GitHub
goldmedal commented on code in PR #14837: URL: https://github.com/apache/datafusion/pull/14837#discussion_r2087328465 ## datafusion-examples/examples/async_udf.rs: ## @@ -0,0 +1,256 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Add support for `DENY` statements [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
aharpervc commented on code in PR #1836: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1836#discussion_r2087129057 ## src/parser/mod.rs: ## @@ -13020,14 +13035,18 @@ impl<'a> Parser<'a> { GranteesType::Share } else if self.parse_keywo

Re: [PR] fix: stack overflow for substrait functions with large argument lists that translate to DataFusion binary operators [datafusion]

2025-05-13 Thread via GitHub
fmonjalet commented on code in PR #16031: URL: https://github.com/apache/datafusion/pull/16031#discussion_r2087256490 ## datafusion/substrait/src/logical_plan/consumer/expr/scalar_function.rs: ## @@ -124,6 +109,31 @@ pub fn name_to_op(name: &str) -> Option { } } +/// Bui

Re: [PR] Fix big performance issue in string serialization [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
alamb commented on PR #1848: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1848#issuecomment-2877005389 Thanks @lovasoa and @jayzhan211 for the review cc @iffyio

Re: [PR] Fix big performance issue in string serialization [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
lovasoa commented on PR #1848: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1848#issuecomment-2877007521 thanks for merging !

Re: [PR] Fix big performance issue in string serialization [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
alamb merged PR #1848: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1848

Re: [PR] Implement intermediate result blocked approach to aggregation memory management [datafusion]

2025-05-13 Thread via GitHub
Rachelint commented on PR #15591: URL: https://github.com/apache/datafusion/pull/15591#issuecomment-2877460331 I plan to sort out the code and make it ready again today.

Re: [I] Track peak memory usage in `SortExec` [datafusion]

2025-05-13 Thread via GitHub
2010YOUY01 commented on issue #16042: URL: https://github.com/apache/datafusion/issues/16042#issuecomment-2876879755 @ding-young

Re: [PR] docs: Add note on setting `core.abbrev` when generating diffs [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove merged PR #1735: URL: https://github.com/apache/datafusion-comet/pull/1735

Re: [I] Linear Aggregate Functions Optimization [datafusion]

2025-05-13 Thread via GitHub
Rachelint commented on issue #15633: URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2877373823 > > Found discord is banned in my current mac (work mac belonging to company), I plan to switch to work on my personal mac and start to communicate on it today later. >

Re: [PR] Introduce Async User Defined Functions [datafusion]

2025-05-13 Thread via GitHub
goldmedal commented on PR #14837: URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2877380478 > That's a use case, but there are others too. Maybe one runs a forecast model, which is a little too complicated to "embed" into the query engine. In that case, we may still want

[I] Support distribution as a MetricValue in ExecutionPlan [datafusion]

2025-05-13 Thread via GitHub
sfluor opened a new issue, #16044: URL: https://github.com/apache/datafusion/issues/16044 ### Is your feature request related to a problem or challenge? The MetricValue enum currently exposes only single-value statistics: counts, gauges, timers, timestamps, and a few hard-coded varian

Re: [PR] Fix big performance issue in string serialization [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
lovasoa commented on PR #1848: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1848#issuecomment-2877002129 @alamb : I meant that the code before this PR handled the string char by char. I don't think this was a regression.

Re: [I] Project Ideas for GSoC 2025 (Google Summer of Code) [datafusion]

2025-05-13 Thread via GitHub
2010YOUY01 commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2876959391 We'll be using Discord for part of the discussion. Feel free to jump in @Rachelint @waynexia and anyone else! https://discord.com/channels/885562378132000778/13718690628831

Re: [I] [EPIC] Improve sqlparser performance [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
lovasoa commented on issue #1557: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1557#issuecomment-2876982494 @alamb : related: https://github.com/apache/datafusion-sqlparser-rs/pull/1848 (we are currently serializing literal strings character by character)

Re: [PR] Fix big performance issue in string serialization [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
alamb commented on PR #1848: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1848#issuecomment-2876988268 When you say "old" code do you know what PR introduced this regression?

Re: [I] [feature] allow pretty-printing [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
lovasoa commented on issue #1845: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1845#issuecomment-2877014030 follow-up tracking issue in https://github.com/apache/datafusion-sqlparser-rs/issues/1850

Re: [PR] chores: Add lint rule to enforce string formatting style [datafusion]

2025-05-13 Thread via GitHub
comphead merged PR #16024: URL: https://github.com/apache/datafusion/pull/16024

Re: [I] Add lint rule to enforce string formatting style [datafusion]

2025-05-13 Thread via GitHub
comphead closed issue #16021: Add lint rule to enforce string formatting style URL: https://github.com/apache/datafusion/issues/16021

Re: [PR] fix: stack overflow for substrait functions with large argument lists that translate to DataFusion binary operators [datafusion]

2025-05-13 Thread via GitHub
fmonjalet commented on PR #16031: URL: https://github.com/apache/datafusion/pull/16031#issuecomment-2877173240 Oh right it makes sense, I'll try to produce a test based on `consume_expression` then. Thanks!

Re: [PR] fix: stack overflow for substrait functions with large argument lists that translate to DataFusion binary operators [datafusion]

2025-05-13 Thread via GitHub
fmonjalet commented on PR #16031: URL: https://github.com/apache/datafusion/pull/16031#issuecomment-2877308068 @jayzhan211 and @gabotechs thank you for your guidance, I updated the PR: - Expressions are not cloned anymore. - The test has been scoped to a much more unit test style, w

[PR] Use human-readable byte sizes in `EXPLAIN` [datafusion]

2025-05-13 Thread via GitHub
tlm365 opened a new pull request, #16043: URL: https://github.com/apache/datafusion/pull/16043 ## Which issue does this PR close? - Closes #16041 . ## Rationale for this change ## What changes are included in this PR? Use human-readable for `spilled_byt

Re: [PR] Docs: Add example of creating a field in `return_field_from_args` [datafusion]

2025-05-13 Thread via GitHub
comphead commented on code in PR #16039: URL: https://github.com/apache/datafusion/pull/16039#discussion_r2087285264 ## datafusion/expr/src/udf.rs: ## @@ -451,7 +451,7 @@ pub trait ScalarUDFImpl: Debug + Send + Sync { /// /// # Notes /// -/// Most UDFs should

Re: [PR] feat: metadata handling for aggregates and window functions [datafusion]

2025-05-13 Thread via GitHub
alamb commented on PR #15911: URL: https://github.com/apache/datafusion/pull/15911#issuecomment-2877038182 🤖: Benchmark completed Details ``` group feat_metadata-handling-aggregates main -

Re: [PR] Add support for table valued functions for SQL Server [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
aharpervc commented on code in PR #1839: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1839#discussion_r2087154451 ## src/parser/mod.rs: ## @@ -5203,19 +5203,79 @@ impl<'a> Parser<'a> { let (name, args) = self.parse_create_function_name_and_params()?;

Re: [PR] fix: stack overflow for substrait functions with large argument lists that translate to DataFusion binary operators [datafusion]

2025-05-13 Thread via GitHub
jayzhan211 commented on PR #16031: URL: https://github.com/apache/datafusion/pull/16031#issuecomment-2876910540 > @jayzhan211 After looking into it, generating the plan programmatically may become very unwieldy (very verbose). Is your concern about the size of the file checked into git, or t

[PR] [WIP] Remove `COMET_SHUFFLE_FALLBACK_TO_COLUMNAR` config [datafusion-comet]

2025-05-13 Thread via GitHub
andygrove opened a new pull request, #1736: URL: https://github.com/apache/datafusion-comet/pull/1736 ## Which issue does this PR close? Closes https://github.com/apache/datafusion-comet/issues/1254 ## Rationale for this change The config `COMET_SHUFFLE_FA

Re: [PR] Add support for table valued functions for SQL Server [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
aharpervc commented on code in PR #1839: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1839#discussion_r2087207267 ## src/ast/data_type.rs: ## @@ -48,7 +48,15 @@ pub enum DataType { /// Table type in [PostgreSQL], e.g. CREATE FUNCTION RETURNS TABLE(...).

Re: [PR] Add support for table valued functions for SQL Server [datafusion-sqlparser-rs]

2025-05-13 Thread via GitHub
aharpervc commented on code in PR #1839: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1839#discussion_r2087211209 ## src/ast/data_type.rs: ## @@ -48,7 +48,15 @@ pub enum DataType { /// Table type in [PostgreSQL], e.g. CREATE FUNCTION RETURNS TABLE(...).