Re: [PR] fix: graceful NULL and type error handling in array functions [datafusion]

2025-02-24 Thread via GitHub
jayzhan211 commented on code in PR #14737: URL: https://github.com/apache/datafusion/pull/14737#discussion_r1969168798 ## datafusion/functions-nested/src/sort.rs: ## @@ -143,6 +168,10 @@ pub fn array_sort_inner(args: &[ArrayRef]) -> Result { return exec_err!("array_sor

Re: [PR] Add support column prefix index for MySQL [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
zzzdong commented on PR #1732: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1732#issuecomment-2680992482 Considering the work on the index parsing in #1707, this will be closed. -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] Add support column prefix index for MySQL [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
zzzdong closed pull request #1732: Add support column prefix index for MySQL URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1732 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [PR] replace TypeSignature::String with TypeSignature::Coercible for starts_with [datafusion]

2025-02-24 Thread via GitHub
zjregee commented on PR #14812: URL: https://github.com/apache/datafusion/pull/14812#issuecomment-2680962268 I think this PR is ready for review now. cc: @jayzhan211 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] replace TypeSignature::String with TypeSignature::Coercible for trim functions [datafusion]

2025-02-24 Thread via GitHub
zjregee commented on PR #14865: URL: https://github.com/apache/datafusion/pull/14865#issuecomment-2680958642 I think this PR is ready for review now. cc: @jayzhan211 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Enable Dataframe to be converted into views which can be used in register_table [datafusion-python]

2025-02-24 Thread via GitHub
kosiew commented on code in PR #1016: URL: https://github.com/apache/datafusion-python/pull/1016#discussion_r1969081210 ## src/dataframe.rs: ## @@ -156,6 +174,22 @@ impl PyDataFrame { PyArrowType(self.df.schema().into()) } +/// Convert this DataFrame into a

Re: [PR] Parse signed/unsigned integer data type in MySQL CAST [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
iffyio commented on code in PR #1739: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1739#discussion_r1969077053 ## src/ast/data_type.rs: ## @@ -238,6 +238,26 @@ pub enum DataType { UnsignedBigInt(Option), /// Unsigned Int8 with optional display width e.g

Re: [PR] feat: adjust create and drop trigger for mysql dialect [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
iffyio merged PR #1734: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1734 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

Re: [PR] Add support column prefix index for MySQL [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
iffyio commented on code in PR #1732: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1732#discussion_r1969053806 ## src/parser/mod.rs: ## @@ -7646,6 +7646,39 @@ impl<'a> Parser<'a> { } } +pub fn parse_index_exprs(&mut self) -> Result, ParserErro

Re: [PR] fix: graceful NULL and type error handling in array functions [datafusion]

2025-02-24 Thread via GitHub
alan910127 commented on code in PR #14737: URL: https://github.com/apache/datafusion/pull/14737#discussion_r1969021807 ## datafusion/functions-nested/src/sort.rs: ## @@ -143,6 +168,10 @@ pub fn array_sort_inner(args: &[ArrayRef]) -> Result { return exec_err!("array_sor

Re: [PR] Store spans for Value expressions [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
iffyio merged PR #1738: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1738 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

Re: [PR] Store spans for Value expressions [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
iffyio commented on code in PR #1738: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1738#discussion_r1969034324 ## tests/sqlparser_bigquery.rs: ## @@ -29,19 +29,19 @@ use test_utils::*; #[test] fn parse_literal_string() { let sql = concat!( -"SELECT

Re: [I] [Discussion] Efficient Row Selection for Multi-Engine Support [datafusion]

2025-02-24 Thread via GitHub
Arpit-Bandejiya commented on issue #14816: URL: https://github.com/apache/datafusion/issues/14816#issuecomment-2680785661 Found this PR in arrow-rs: https://github.com/apache/arrow-rs/pull/6624. @XiangpengHao I see the PR has been in draft for some time. Is there any other we are trying to do i

Re: [I] ExternalSorter Fails to Spill Dictionaries [datafusion]

2025-02-24 Thread via GitHub
davidhewitt commented on issue #4658: URL: https://github.com/apache/datafusion/issues/4658#issuecomment-2680773931 Ok great, I'll work on that today 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[PR] replace TypeSignature::String with TypeSignature::Coercible for trim functions [datafusion]

2025-02-24 Thread via GitHub
zjregee opened a new pull request, #14865: URL: https://github.com/apache/datafusion/pull/14865 ## Which issue does this PR close? - Part of #14759. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested? Yes. ## Are there any user-

Re: [PR] Examples: boundary analysis example for `AND/OR` conjunctions [datafusion]

2025-02-24 Thread via GitHub
ozankabak commented on PR #14735: URL: https://github.com/apache/datafusion/pull/14735#issuecomment-2680697590 Indeed we will open an epic to plan the remaining statistics infrastructure work. I plan to put something together today/tomorrow. -- This is an automated message from the Apache

Re: [I] [Discussion] Efficient Row Selection for Multi-Engine Support [datafusion]

2025-02-24 Thread via GitHub
bharath-techie commented on issue #14816: URL: https://github.com/apache/datafusion/issues/14816#issuecomment-2680694026 Thanks @chenkovsky for confirming. We are new to datafusion, but at a high level it looks like this feature will need a deeper integration in the ParquetExec flow and

[PR] Fixed Migrate Datetime functions to invoke_with_args Issue 14705 [datafusion]

2025-02-24 Thread via GitHub
varun-bhardwaj-sde opened a new pull request, #14864: URL: https://github.com/apache/datafusion/pull/14864 ## Which issue does this PR close? - Closes #14705. ## Rationale for this change ## What changes are included in this PR? ## Are t

Re: [PR] Fixed Migrate Datetime functions to invoke_with_args Issue 14705 [datafusion]

2025-02-24 Thread via GitHub
varun-bhardwaj-sde commented on PR #14864: URL: https://github.com/apache/datafusion/pull/14864#issuecomment-2680507929 @niebayes can you please review this and guide me if I am making any mistakes here -- This is an automated message from the Apache Git Service. To respond to the mess

Re: [PR] Fixed Migrate Datetime functions to invoke_with_args Issue 14705 [datafusion]

2025-02-24 Thread via GitHub
varun-bhardwaj-sde commented on PR #14792: URL: https://github.com/apache/datafusion/pull/14792#issuecomment-2680505232 It was closed by mistake; I raised another PR https://github.com/apache/datafusion/pull/14864 for this task -- This is an automated message from the Apache Git Service.

Re: [I] Migrate Datetime functions to `invoke_with_args` [datafusion]

2025-02-24 Thread via GitHub
varun-bhardwaj-sde commented on issue #14705: URL: https://github.com/apache/datafusion/issues/14705#issuecomment-2680482983 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] Examples: boundary analysis example for `AND/OR` conjunctions [datafusion]

2025-02-24 Thread via GitHub
clflushopt commented on PR #14735: URL: https://github.com/apache/datafusion/pull/14735#issuecomment-2680374034 @alamb @ozankabak do we plan to have open issues for the follow up changes described in #14699 ? I am specifically trying to figure out whether we should address those items as pa

Re: [PR] Examples: boundary analysis example for `AND/OR` conjunctions [datafusion]

2025-02-24 Thread via GitHub
clflushopt commented on PR #14735: URL: https://github.com/apache/datafusion/pull/14735#issuecomment-2680365175 @alamb any suggestions on what to improve here vis-a-vis documentation or the example since #14699 has been merged ? -- This is an automated message from the Apache Git Service.

Re: [PR] Add DataFrame fill_null [datafusion]

2025-02-24 Thread via GitHub
kosiew commented on code in PR #14769: URL: https://github.com/apache/datafusion/pull/14769#discussion_r1967089979 ## datafusion/core/src/dataframe/mod.rs: ## @@ -1926,6 +1930,71 @@ impl DataFrame { plan, }) } + +/// Fill null values in specified c

Re: [PR] fix: fetch is missed during EnforceDistribution [datafusion]

2025-02-24 Thread via GitHub
xudong963 commented on code in PR #14207: URL: https://github.com/apache/datafusion/pull/14207#discussion_r1968751449 ## datafusion/core/tests/physical_optimizer/enforce_distribution.rs: ## @@ -3154,3 +3164,104 @@ fn optimize_away_unnecessary_repartition2() -> Result<()> {

Re: [PR] Add DataFrame fill_null [datafusion]

2025-02-24 Thread via GitHub
kosiew commented on PR #14769: URL: https://github.com/apache/datafusion/pull/14769#issuecomment-2680284864 > its documented in https://docs.rs/datafusion/latest/datafusion/dataframe/struct.DataFrame.html not in the code itself I added fill_null example usage in dataframe/mod.rs for

Re: [PR] Add DataFrame fill_null [datafusion]

2025-02-24 Thread via GitHub
kosiew commented on code in PR #14769: URL: https://github.com/apache/datafusion/pull/14769#discussion_r1968745726 ## datafusion/core/src/dataframe/mod.rs: ## @@ -1926,6 +1930,71 @@ impl DataFrame { plan, }) } + +/// Fill null values in specified c

Re: [PR] chore: remove jackson dependency [datafusion-comet]

2025-02-24 Thread via GitHub
codecov-commenter commented on PR #1442: URL: https://github.com/apache/datafusion-comet/pull/1442#issuecomment-2680235325 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1442?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Implement actual count wildcard in physical layer and fix duplicated schema name error from count wildcard [datafusion]

2025-02-24 Thread via GitHub
jayzhan211 commented on code in PR #14824: URL: https://github.com/apache/datafusion/pull/14824#discussion_r1968726701 ## datafusion/core/tests/dataframe/dataframe_functions.rs: ## @@ -1145,9 +1145,9 @@ async fn test_count_wildcard() -> Result<()> { .build() .u

Re: [PR] Implement actual count wildcard in physical layer and fix duplicated schema name error from count wildcard [datafusion]

2025-02-24 Thread via GitHub
jayzhan211 commented on code in PR #14824: URL: https://github.com/apache/datafusion/pull/14824#discussion_r1968712649 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -2455,7 +2455,7 @@ async fn test_count_wildcard_on_sort() -> Result<()> { let ctx = create_join_context()?

Re: [PR] Adding node_id to ExecutionPlanProperties [datafusion]

2025-02-24 Thread via GitHub
github-actions[bot] closed pull request #12186: Adding node_id to ExecutionPlanProperties URL: https://github.com/apache/datafusion/pull/12186 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

Re: [PR] Feat: Implement hf:// / "hugging face" integration in datafusion-cli [datafusion]

2025-02-24 Thread via GitHub
github-actions[bot] closed pull request #10792: Feat: Implement hf:// / "hugging face" integration in datafusion-cli URL: https://github.com/apache/datafusion/pull/10792 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-02-24 Thread via GitHub
kazuyukitanimura commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1968689241 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1126,27 +1129,33 @@ class CometCastSuite extends CometTestBase with AdaptiveSpa

Re: [I] [Epic] Split datasources out from `datafusion` crate (`datafusion/core`) [datafusion]

2025-02-24 Thread via GitHub
AdamGS commented on issue #1: URL: https://github.com/apache/datafusion/issues/1#issuecomment-2679630951 https://github.com/apache/datafusion/pull/14838 turned out to be easier than I thought, mostly because @logan-keede did much of the work to move test infrastructure over. -- T

[PR] chore: remove jackson dependency [datafusion-comet]

2025-02-24 Thread via GitHub
kazuyukitanimura opened a new pull request, #1442: URL: https://github.com/apache/datafusion-comet/pull/1442 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? Removed jackson dependency ## How

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-02-24 Thread via GitHub
shehabgamin commented on issue #14123: URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2680109294 I will test with Sail by Wednesday! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-02-24 Thread via GitHub
himadripal commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1968655592 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1126,27 +1129,33 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlan

Re: [I] Browser-accessible official DataFusion playground / DataFusion fiddle [datafusion]

2025-02-24 Thread via GitHub
waynexia commented on issue #13818: URL: https://github.com/apache/datafusion/issues/13818#issuecomment-2680102365 I'm imagining a highly encapsulated JS component like ```tsx ``` and the user who wants an interactive terminal can render a DataFusion webpage with one line,

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-02-24 Thread via GitHub
himadripal commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1968654009 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1126,27 +1129,33 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlan

Re: [PR] DataFusion 45 blog post [datafusion-site]

2025-02-24 Thread via GitHub
phillipleblanc commented on code in PR #57: URL: https://github.com/apache/datafusion-site/pull/57#discussion_r1968653103 ## content/blog/2025-02-20-datafusion-45.0.0.md: ## @@ -0,0 +1,315 @@ +--- +layout: post +title: Apache DataFusion 45.0.0 Released +date: 2025-02-20 +author:

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-02-24 Thread via GitHub
himadripal commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1968652251 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1126,27 +1129,33 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlan

Re: [PR] refactor: use TypeSignature::Coercible for crypto functions [datafusion]

2025-02-24 Thread via GitHub
jayzhan211 commented on PR #14826: URL: https://github.com/apache/datafusion/pull/14826#issuecomment-2680085289 > [SQL] EXPLAIN SELECT digest(column1_utf8view, 'md5') as c FROM test; [Diff] (-expected|+actual) logical_plan - 01)Projection: digest(test.column1_utf8view,

Re: [PR] fix: enable full decimal to decimal support [datafusion-comet]

2025-02-24 Thread via GitHub
kazuyukitanimura commented on code in PR #1385: URL: https://github.com/apache/datafusion-comet/pull/1385#discussion_r1968649099 ## spark/src/test/scala/org/apache/comet/CometCastSuite.scala: ## @@ -1126,27 +1129,33 @@ class CometCastSuite extends CometTestBase with AdaptiveSpa

Re: [I] Blog / Example of how to compile DataFusion to WASM [datafusion]

2025-02-24 Thread via GitHub
waynexia commented on issue #13715: URL: https://github.com/apache/datafusion/issues/13715#issuecomment-2680048213 There is a developer guide for building, debugging, and publishing the WASM bindings: https://github.com/datafusion-contrib/datafusion-wasm-bindings/blob/main/CONTRIBUTING.md

Re: [PR] test: Register Spark-compatible expressions with a DataFusion context [datafusion-comet]

2025-02-24 Thread via GitHub
andygrove merged PR #1432: URL: https://github.com/apache/datafusion-comet/pull/1432 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-02-24 Thread via GitHub
djanderson commented on code in PR #14286: URL: https://github.com/apache/datafusion/pull/14286#discussion_r1968617896 ## datafusion-examples/examples/thread_pools_lib/dedicated_executor.rs: ## @@ -0,0 +1,1778 @@ +// Licensed to the Apache Software Foundation (ASF) under one +//

Re: [I] Add example to spark-expr crate [datafusion-comet]

2025-02-24 Thread via GitHub
andygrove closed issue #1365: Add example to spark-expr crate URL: https://github.com/apache/datafusion-comet/issues/1365 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [I] [Discussion] Efficient Row Selection for Multi-Engine Support [datafusion]

2025-02-24 Thread via GitHub
chenkovsky commented on issue #14816: URL: https://github.com/apache/datafusion/issues/14816#issuecomment-2679992458 > Hi @chenkovsky , > Thanks a ton for quick POC on this. :) > > The row ids seems to be specific to each batch and not across the entire parquet file - is my unders

[PR] Workaround for compilation error due to rkyv#434. [datafusion]

2025-02-24 Thread via GitHub
ryzhyk opened a new pull request, #14863: URL: https://github.com/apache/datafusion/pull/14863 ## Which issue does this PR close? - Closes #14862 ## Rationale for this change When datafusion is used in a workspace that enables the `rkyv-64` feature in the `chrono` crate,

[I] Compilation error due to rkyv/rkyv#434 [datafusion]

2025-02-24 Thread via GitHub
ryzhyk opened a new issue, #14862: URL: https://github.com/apache/datafusion/issues/14862 ### Describe the bug When datafusion is used in a workspace that enables the `rkyv-64` feature in the `chrono` crate, this triggers a Rust compilation error: ``` error[E0277]: can't

Re: [PR] feat: use edition 2024 [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
alamb commented on PR #1736: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1736#issuecomment-2679691606 > Adding an MSRV sounds good to me! Filed an issue to track: - https://github.com/apache/datafusion-sqlparser-rs/issues/1744 -- This is an automated message from

Re: [PR] Set projection before configuring the source [datafusion]

2025-02-24 Thread via GitHub
milenkovicm commented on PR #14685: URL: https://github.com/apache/datafusion/pull/14685#issuecomment-2679834606 Would it make sense to add a test which would cover plan round trip? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [I] DeltaLake integration not working (Python) [datafusion]

2025-02-24 Thread via GitHub
mag1cfrog commented on issue #14842: URL: https://github.com/apache/datafusion/issues/14842#issuecomment-2679777967 not sure really relevant, but I'm also having some compilation issue when trying to use deltalake with datafusion in Rust. Here's what I get: ```bash Compiling

Re: [PR] Refactor SortPushdown using the standard top-down visitor. [datafusion]

2025-02-24 Thread via GitHub
alamb commented on code in PR #14821: URL: https://github.com/apache/datafusion/pull/14821#discussion_r1968482276 ## datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs: ## @@ -98,66 +98,91 @@ fn pushdown_sorts_helper( .ordering_satisfy_requirement(&paren

Re: [PR] Refactor SortPushdown using the standard top-down visitor. [datafusion]

2025-02-24 Thread via GitHub
alamb commented on code in PR #14821: URL: https://github.com/apache/datafusion/pull/14821#discussion_r1968480652 ## datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs: ## @@ -98,66 +98,91 @@ fn pushdown_sorts_helper( .ordering_satisfy_requirement(&paren

Re: [I] Migrate Unicode function to `invoke_with_args` [datafusion]

2025-02-24 Thread via GitHub
alamb closed issue #14709: Migrate Unicode function to `invoke_with_args` URL: https://github.com/apache/datafusion/issues/14709 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] Add `range` table function [datafusion]

2025-02-24 Thread via GitHub
alamb merged PR #14830: URL: https://github.com/apache/datafusion/pull/14830 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [I] Create an MSRV policy in this crate [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
mvzink commented on issue #1744: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1744#issuecomment-2679756348 From looking into #1612, some test code uses associated type bounds ([stabilized in 1.79](https://github.com/rust-lang/rust/pull/122055/)). It could be rewritten if a

Re: [PR] chore: migrate invoke_batch to invoke_with_args for unicode function [datafusion]

2025-02-24 Thread via GitHub
alamb commented on PR #14856: URL: https://github.com/apache/datafusion/pull/14856#issuecomment-2679749126 🚀 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] chore: migrate invoke_batch to invoke_with_args for unicode function [datafusion]

2025-02-24 Thread via GitHub
alamb merged PR #14856: URL: https://github.com/apache/datafusion/pull/14856 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] Add `range` table function [datafusion]

2025-02-24 Thread via GitHub
alamb commented on PR #14830: URL: https://github.com/apache/datafusion/pull/14830#issuecomment-2679748456 🚀 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] fix(substrait): Do not add implicit groupBy expressions when building logical plans from Substrait [datafusion]

2025-02-24 Thread via GitHub
alamb commented on code in PR #14860: URL: https://github.com/apache/datafusion/pull/14860#discussion_r1968445509 ## datafusion/expr/src/logical_plan/builder.rs: ## @@ -63,6 +64,32 @@ use indexmap::IndexSet; /// Default table name for unnamed table pub const UNNAMED_TABLE: &st

Re: [PR] Store spans for Value expressions [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
lovasoa commented on code in PR #1738: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1738#discussion_r1968471231 ## src/ast/mod.rs: ## @@ -8789,9 +8796,9 @@ mod tests { #[test] fn test_interval_display() { let interval = Expr::Interval(Interval

Re: [PR] Move `FileSourceConfig` and `FileStream` to the new `datafusion-datasource` [datafusion]

2025-02-24 Thread via GitHub
alamb commented on code in PR #14838: URL: https://github.com/apache/datafusion/pull/14838#discussion_r1968468146 ## datafusion/datasource/src/file_scan_config.rs: ## @@ -15,19 +15,611 @@ // specific language governing permissions and limitations // under the License. -use s

Re: [PR] fix(substrait): Do not add implicit groupBy expressions when building logical plans from Substrait [datafusion]

2025-02-24 Thread via GitHub
alamb commented on PR #14860: URL: https://github.com/apache/datafusion/pull/14860#issuecomment-2679711714 Hi @anlinc It looks like there is a small clippy failure with this PR: https://github.com/apache/datafusion/actions/runs/13507870042/job/37742937287?pr=14860 -- This

Re: [I] Improve join building performance [datafusion]

2025-02-24 Thread via GitHub
Dandandan closed issue #14859: Improve join building performance URL: https://github.com/apache/datafusion/issues/14859 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsub

Re: [PR] Improve performance of join building performance (up to 1.28x speedup) [datafusion]

2025-02-24 Thread via GitHub
Dandandan closed pull request #14861: Improve performance of join building performance (up to 1.28x speedup) URL: https://github.com/apache/datafusion/pull/14861 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

[I] Create an MSRV policy in this crate [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
alamb opened a new issue, #1744: URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1744 TLDR would be: 1. Define an MSRV policy (maybe copy the existing DataFusion one) 2. Implement some sort of CI check (again can copy / reuse the datafusion one if we want) 3.

Re: [PR] Store spans for Value expressions [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
alamb commented on code in PR #1738: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1738#discussion_r1968440688 ## src/ast/mod.rs: ## @@ -8789,9 +8796,9 @@ mod tests { #[test] fn test_interval_display() { let interval = Expr::Interval(Interval {

Re: [I] Support User-Defined Sorting [datafusion]

2025-02-24 Thread via GitHub
tobixdev commented on issue #14828: URL: https://github.com/apache/datafusion/issues/14828#issuecomment-2679679211 So i dug a little deeper on how we could implement this functionality. ### arrow-rs Firstly, I think we require two "flavors" of sorting - one for `arrow-ord` and

Re: [PR] fix(substrait): Do not add implicit groupBy expressions when building logical plans from Substrait [datafusion]

2025-02-24 Thread via GitHub
anlinc commented on PR #14553: URL: https://github.com/apache/datafusion/pull/14553#issuecomment-2679664474 @Blizzara @alamb I am closing this in favor of the latest iteration here: https://github.com/apache/datafusion/pull/14860, which addresses the discussions in this PR. -- This is an

Re: [PR] fix(substrait): Do not add implicit groupBy expressions when building logical plans from Substrait [datafusion]

2025-02-24 Thread via GitHub
anlinc closed pull request #14553: fix(substrait): Do not add implicit groupBy expressions when building logical plans from Substrait URL: https://github.com/apache/datafusion/pull/14553 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

[PR] Improve performance of join building performance (up to 1.28x speedup) [datafusion]

2025-02-24 Thread via GitHub
Dandandan opened a new pull request, #14861: URL: https://github.com/apache/datafusion/pull/14861 ## Which issue does this PR close? - Closes #14859 ## Rationale for this change Performance improvements for join (up to 1.28x) ```

Re: [PR] build(deps): bump prost-types from 0.13.4 to 0.13.5 [datafusion-python]

2025-02-24 Thread via GitHub
dependabot[bot] commented on PR #1021: URL: https://github.com/apache/datafusion-python/pull/1021#issuecomment-2679578887 Looks like prost-types is up-to-date now, so this is no longer needed. -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] Chore: Release datafusion-python 45 [datafusion-python]

2025-02-24 Thread via GitHub
kevinjqliu commented on PR #1024: URL: https://github.com/apache/datafusion-python/pull/1024#issuecomment-2679610682 woot! I see 45.2.0 on https://pypi.org/project/datafusion/ thanks for working on the release! -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] Chore: Release datafusion-python 45 [datafusion-python]

2025-02-24 Thread via GitHub
timsaucer merged PR #1024: URL: https://github.com/apache/datafusion-python/pull/1024 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

Re: [PR] build(deps): bump prost-types from 0.13.4 to 0.13.5 [datafusion-python]

2025-02-24 Thread via GitHub
dependabot[bot] closed pull request #1021: build(deps): bump prost-types from 0.13.4 to 0.13.5 URL: https://github.com/apache/datafusion-python/pull/1021 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] build(deps): bump prost from 0.13.4 to 0.13.5 [datafusion-python]

2025-02-24 Thread via GitHub
dependabot[bot] commented on PR #1022: URL: https://github.com/apache/datafusion-python/pull/1022#issuecomment-267957 Looks like prost is up-to-date now, so this is no longer needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] build(deps): bump prost from 0.13.4 to 0.13.5 [datafusion-python]

2025-02-24 Thread via GitHub
dependabot[bot] closed pull request #1022: build(deps): bump prost from 0.13.4 to 0.13.5 URL: https://github.com/apache/datafusion-python/pull/1022 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] [WIP] Move `FileSourceConfig` and `FileStream` to the new `datafusion-datasource` [datafusion]

2025-02-24 Thread via GitHub
AdamGS commented on code in PR #14838: URL: https://github.com/apache/datafusion/pull/14838#discussion_r1968368078 ## datafusion/datasource/src/file_scan_config.rs: ## @@ -15,19 +15,611 @@ // specific language governing permissions and limitations // under the License. -use

Re: [PR] chore: Strip debuginfo symbols for release [datafusion]

2025-02-24 Thread via GitHub
comphead commented on PR #14843: URL: https://github.com/apache/datafusion/pull/14843#issuecomment-2679541024 Hey @rkrishn7 this is a good call yes, although DF does not catch panics, we cannot say the same for the dependencies. I'll experiment later with panic mode for DF crate level only, leav

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-02-24 Thread via GitHub
alamb commented on issue #14123: URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2679454880 I made a PR for testing in delta here: - https://github.com/delta-io/delta-rs/pull/3261 Still has some issues to work out

Re: [PR] chore: Strip debuginfo symbols for release [datafusion]

2025-02-24 Thread via GitHub
rkrishn7 commented on PR #14843: URL: https://github.com/apache/datafusion/pull/14843#issuecomment-2679441117 > DF does not catch panics, so the process will crash anyway no matter what the setting is. @comphead If I'm not mistaken, DF may catch panics in certain cases. For exa

[I] Experimental native scan test failures [datafusion-comet]

2025-02-24 Thread via GitHub
mbutrovich opened a new issue, #1441: URL: https://github.com/apache/datafusion-comet/issues/1441 We have two experimental native scans based on DataFusion's ParquetExec operator. This issue will track the remaining test failures and any related notes as we bring the test failures down to z

Re: [PR] Parse signed/unsigned integer data types correctly in MySQL CAST [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
mvzink commented on code in PR #1739: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1739#discussion_r1968070815 ## src/ast/mod.rs: ## @@ -798,9 +798,15 @@ pub enum Expr { kind: CastKind, expr: Box, data_type: DataType, -// Option

Re: [PR] [WIP] Move `FileSourceConfig` and `FileStream` to the new `datafusion-datasource` [datafusion]

2025-02-24 Thread via GitHub
AdamGS commented on code in PR #14838: URL: https://github.com/apache/datafusion/pull/14838#discussion_r1968197144 ## datafusion/datasource/Cargo.toml: ## @@ -69,6 +71,7 @@ xz2 = { version = "0.1", optional = true, features = ["static"] } zstd = { version = "0.13", optional =

Re: [PR] chore: Re-organize shuffle writer code [datafusion-comet]

2025-02-24 Thread via GitHub
mbutrovich commented on PR #1439: URL: https://github.com/apache/datafusion-comet/pull/1439#issuecomment-2679374261 This is a great help as we look to improve the shuffle performance. Thanks @andygrove!

Re: [PR] chore: Strip debuginfo symbols for release [datafusion]

2025-02-24 Thread via GitHub
comphead commented on code in PR #14843: URL: https://github.com/apache/datafusion/pull/14843#discussion_r1968211931 ## Cargo.toml: ## @@ -159,19 +159,20 @@ url = "2.5.4" [profile.release] codegen-units = 1 lto = true +debug = false Review Comment: Agree

Re: [I] Remove the need for registering an ObjectStore for remote files [datafusion-python]

2025-02-24 Thread via GitHub
robtandy commented on issue #899: URL: https://github.com/apache/datafusion-python/issues/899#issuecomment-2679361030 @kylebarron What are your thoughts on how to approach this? I'm happy to try to address it and submit a PR for it. I'd like the same thing for github.com/apache/data

Re: [PR] chore: Strip debuginfo symbols for release [datafusion]

2025-02-24 Thread via GitHub
comphead commented on PR #14843: URL: https://github.com/apache/datafusion/pull/14843#issuecomment-2679349895 DF does not catch panics, so the process will crash anyway no matter what the setting is. Although there are some test cases which do `catch_unwind` but we do run tests in deb

Re: [PR] Support bounds evaluation for temporal data types [datafusion]

2025-02-24 Thread via GitHub
ch-sc commented on code in PR #14523: URL: https://github.com/apache/datafusion/pull/14523#discussion_r1968018999 ## datafusion/physical-expr/src/expressions/in_list.rs: ## @@ -398,6 +399,51 @@ impl PhysicalExpr for InListExpr { self.static_filter.clone(),

Re: [PR] [WIP] Move `FileSourceConfig` and `FileStream` to the new `datafusion-datasource` [datafusion]

2025-02-24 Thread via GitHub
AdamGS commented on code in PR #14838: URL: https://github.com/apache/datafusion/pull/14838#discussion_r1968201066 ## datafusion/core/Cargo.toml: ## @@ -40,7 +40,7 @@ nested_expressions = ["datafusion-functions-nested"] # This feature is deprecated. Use the `nested_expressions`

Re: [PR] [WIP] Move `FileSourceConfig` and `FileStream` to the new `datafusion-datasource` [datafusion]

2025-02-24 Thread via GitHub
AdamGS commented on code in PR #14838: URL: https://github.com/apache/datafusion/pull/14838#discussion_r1968175868 ## datafusion/common/src/test_util.rs: ## @@ -28,7 +28,7 @@ use std::{error::Error, path::PathBuf}; /// /// Expects to be called about like this: /// -/// `asser

Re: [I] ExternalSorter Fails to Spill Dictionaries [datafusion]

2025-02-24 Thread via GitHub
tustvold commented on issue #4658: URL: https://github.com/apache/datafusion/issues/4658#issuecomment-2679171994 > I assume by row format you mean [arrow-row](https://arrow.apache.org/rust/arrow_row/index.html), however it's not clear to me if there's a standard way to serialize these to a

Re: [I] [Discussion] Efficient Row Selection for Multi-Engine Support [datafusion]

2025-02-24 Thread via GitHub
bharath-techie commented on issue #14816: URL: https://github.com/apache/datafusion/issues/14816#issuecomment-2679245307 Hi @chenkovsky , Thanks a ton for the quick POC on this. :) The row ids seem to be specific to each batch and not across the entire parquet file - is my understand

Re: [I] Optimized version of `SortPreservingMerge` that doesn't actually compare sort keys of the key ranges are ordered [datafusion]

2025-02-24 Thread via GitHub
alamb commented on issue #10316: URL: https://github.com/apache/datafusion/issues/10316#issuecomment-2679231481 FWIW I am working on the general analysis to support this operator for some unrelated reason in InfluxDB -- I plan to propose upstreaming it when complete.

Re: [PR] Store spans for Value expressions [datafusion-sqlparser-rs]

2025-02-24 Thread via GitHub
lovasoa commented on PR #1738: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1738#issuecomment-2679141697 It would be nice if we could get this merged before other pending PRs, because it touches almost all the tests, and is guaranteed to generate big conflicts as we change t

Re: [PR] Support bounds evaluation for temporal data types [datafusion]

2025-02-24 Thread via GitHub
ch-sc commented on code in PR #14523: URL: https://github.com/apache/datafusion/pull/14523#discussion_r1967699042 ## datafusion/expr/src/udf.rs: ## @@ -717,9 +722,18 @@ pub trait ScalarUDFImpl: Debug + Send + Sync { /// /// If the function is `ABS(a)`, and the input in
