Re: [PR] Feature Unifying source execution plans [datafusion]

2025-01-21 Thread via GitHub
mertak-synnada commented on code in PR #14224: URL: https://github.com/apache/datafusion/pull/14224#discussion_r1923766886 ## datafusion/core/src/datasource/data_source.rs: ## @@ -0,0 +1,264 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

Re: [PR] Faster reverse() string function for ASCII-only case [datafusion]

2025-01-21 Thread via GitHub
UBarney commented on code in PR #14195: URL: https://github.com/apache/datafusion/pull/14195#discussion_r1923793794 ## datafusion/functions/src/unicode/reverse.rs: ## @@ -119,12 +119,21 @@ fn reverse_impl<'a, T: OffsetSizeTrait, V: StringArrayType<'a>>( ) -> Result { let

Re: [PR] Faster reverse() string function for ASCII-only case [datafusion]

2025-01-21 Thread via GitHub
UBarney commented on code in PR #14195: URL: https://github.com/apache/datafusion/pull/14195#discussion_r1923792688 ## datafusion/functions/src/unicode/reverse.rs: ## @@ -119,12 +119,21 @@ fn reverse_impl<'a, T: OffsetSizeTrait, V: StringArrayType<'a>>( ) -> Result { let

Re: [PR] fix: Improve testing for array_remove and fallback to Spark for unsupported types [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on code in PR #1308: URL: https://github.com/apache/datafusion-comet/pull/1308#discussion_r1921585675 ## spark/src/test/scala/org/apache/comet/DataGenerator.scala: ## @@ -141,4 +146,154 @@ class DataGenerator(r: Random) { Range(0, num).map(_ => generateR

Re: [I] RFC: Should we remove pyarrow feature from datafusion core [datafusion]

2025-01-21 Thread via GitHub
Omega359 commented on issue #14197: URL: https://github.com/apache/datafusion/issues/14197#issuecomment-2604873558 agreed, remove. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

Re: [PR] fix(expr-common): Coerce to Decimal(20, 0) when combining UInt64 with signed integers [datafusion]

2025-01-21 Thread via GitHub
jonahgao commented on PR #14223: URL: https://github.com/apache/datafusion/pull/14223#issuecomment-2604947472 Perhaps we should add a sqllogictest test for #14208. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

Re: [PR] fix(expr-common): Coerce to Decimal(20, 0) when combining UInt64 with signed integers [datafusion]

2025-01-21 Thread via GitHub
nuno-faria commented on PR #14223: URL: https://github.com/apache/datafusion/pull/14223#issuecomment-2605015538 > Perhaps we should add a sqllogictest test for #14208. Done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [I] Any plan to support flink [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on issue #1311: URL: https://github.com/apache/datafusion-comet/issues/1311#issuecomment-2605048596 Gluten plans on adding Flink support in the future according to their [latest talk](https://www.youtube.com/watch?v=GWTj3INSzPg). https://github.com/user-attachment

Re: [PR] Faster reverse() string function for ASCII-only case [datafusion]

2025-01-21 Thread via GitHub
Dandandan commented on code in PR #14195: URL: https://github.com/apache/datafusion/pull/14195#discussion_r1923260496 ## datafusion/functions/src/unicode/reverse.rs: ## @@ -119,12 +119,21 @@ fn reverse_impl<'a, T: OffsetSizeTrait, V: StringArrayType<'a>>( ) -> Result { le

Re: [PR] Parse Snowflake COPY INTO [datafusion-sqlparser-rs]

2025-01-21 Thread via GitHub
alamb commented on code in PR #1669: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1669#discussion_r1923339360 ## src/ast/spans.rs: ## @@ -342,6 +343,15 @@ impl Spanned for Statement { copy_options: _, validation_mode: _,

Re: [PR] Add support for Snowflake account privileges [datafusion-sqlparser-rs]

2025-01-21 Thread via GitHub
alamb commented on PR #1666: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1666#issuecomment-2604090697 > Hey @alamb I'd like that, yes. Let me try and see how it goes :-) Thanks @yoavcloud -- I look forward to continued collaboration. This is going to be great --

Re: [PR] Move EnforceSorting into datafusion-physical-optimizer crate [datafusion]

2025-01-21 Thread via GitHub
berkaysynnada commented on PR #14219: URL: https://github.com/apache/datafusion/pull/14219#issuecomment-2604309137 > cc @berkaysynnada There are two functions with the same name (stream_exec_ordered), one in replace_with_order_preserving_variants.rs with a built-in projection, and on

Re: [PR] Add related source code locations to errors [datafusion]

2025-01-21 Thread via GitHub
eliaperantoni commented on code in PR #13664: URL: https://github.com/apache/datafusion/pull/13664#discussion_r1923538093 ## datafusion/sql/src/expr/mod.rs: ## @@ -165,13 +177,126 @@ impl SqlToRel<'_, S> { schema: &DFSchema, planner_context: &mut PlannerContext

Re: [I] Update ballista python dependencies [datafusion-ballista]

2025-01-21 Thread via GitHub
milenkovicm closed issue #1132: Update ballista python dependencies URL: https://github.com/apache/datafusion-ballista/issues/1132 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [I] Update ballista python dependencies [datafusion-ballista]

2025-01-21 Thread via GitHub
milenkovicm commented on issue #1132: URL: https://github.com/apache/datafusion-ballista/issues/1132#issuecomment-2604285161 this should mostly be done, closing this -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [PR] Parse Postgres's LOCK TABLE statement [datafusion-sqlparser-rs]

2025-01-21 Thread via GitHub
alamb commented on code in PR #1614: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1614#discussion_r1923330659 ## src/ast/mod.rs: ## @@ -7278,16 +7279,126 @@ impl fmt::Display for SearchModifier { } } +/// A `LOCK TABLE ..` statement. MySQL and Postgres va

Re: [PR] Parse Postgres's LOCK TABLE statement [datafusion-sqlparser-rs]

2025-01-21 Thread via GitHub
alamb commented on PR #1614: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1614#issuecomment-2604104920 Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look -- This is an automated messag

Re: [PR] Move EnforceSorting into datafusion-physical-optimizer crate [datafusion]

2025-01-21 Thread via GitHub
buraksenn commented on PR #14219: URL: https://github.com/apache/datafusion/pull/14219#issuecomment-2604106674 cc @berkaysynnada -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] Move `EnforceDistribution` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
berkaysynnada commented on PR #14190: URL: https://github.com/apache/datafusion/pull/14190#issuecomment-2604469608 Hi @logan-keede. Thank you for working on this exhaustive issue. The `join_selection` rule has been moved to the **physical_optimizer** crate, but `enforce_sorting`, `enforce_d

Re: [PR] Update logical-types to main [datafusion]

2025-01-21 Thread via GitHub
jayzhan211 commented on PR #14202: URL: https://github.com/apache/datafusion/pull/14202#issuecomment-2604471874 We can revert it, but I think what you did is correct; not sure what happened 🤔 -- This is an automated message from the Apache Git Service. To respond to the message, pl

Re: [PR] Update logical-types to main [datafusion]

2025-01-21 Thread via GitHub
jayzhan211 commented on PR #14202: URL: https://github.com/apache/datafusion/pull/14202#issuecomment-2604478227 I force-pushed to the previous commit -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[PR] Feature Unifying source execution plans [datafusion]

2025-01-21 Thread via GitHub
mertak-synnada opened a new pull request, #14224: URL: https://github.com/apache/datafusion/pull/14224 ## Which issue does this PR close? Closes #13838. ## Rationale for this change This PR merges all Data sources into one Execution Plan, named DataSourceExec, an

Re: [PR] Update logical-types to main [datafusion]

2025-01-21 Thread via GitHub
tobixdev commented on PR #14202: URL: https://github.com/apache/datafusion/pull/14202#issuecomment-2604503486 Thanks! Should I open a new PR so that we can try it with the new branch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

[PR] fix(expr-common): Coerce to Decimal(20, 0) when combining UInt64 with signed integers [datafusion]

2025-01-21 Thread via GitHub
nuno-faria opened a new pull request, #14223: URL: https://github.com/apache/datafusion/pull/14223 Previously, when combining `UInt64` with any signed integer, the resulting type would be `Int64`, which would result in lost information. Now, combining `UInt64` with a signed integer results
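
The snippet above is cut off, so the following is only a minimal sketch of the arithmetic behind the PR title, not the code from the PR. It uses plain Rust integers, with `i128` standing in for a wide decimal type; the helper `decimal_digits` is hypothetical and introduced only for illustration. It shows that casting `u64::MAX` to `i64` loses the value, while 20 decimal digits are enough to hold both `u64::MAX` and `i64::MIN`.

```rust
/// Hypothetical helper: number of decimal digits needed to write `v`, ignoring the sign.
fn decimal_digits(v: i128) -> u32 {
    let mut n = v.unsigned_abs();
    let mut digits = 1;
    while n >= 10 {
        n /= 10;
        digits += 1;
    }
    digits
}

fn main() {
    // Reinterpreting the largest UInt64 value as Int64 wraps around to -1.
    let big: u64 = u64::MAX;
    assert_eq!(big as i64, -1);

    // A type with 20 decimal digits of precision and scale 0 can hold the
    // extremes of both UInt64 and Int64 (i128 is only a stand-in here).
    assert_eq!(decimal_digits(i128::from(u64::MAX)), 20);
    assert_eq!(decimal_digits(i128::from(i64::MIN)), 19);

    println!("u64::MAX as i64 = {}", big as i64);
}
```
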

Re: [PR] Move `EnforceDistribution` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
logan-keede commented on PR #14190: URL: https://github.com/apache/datafusion/pull/14190#issuecomment-2604671660 @berkaysynnada This PR is pretty much in its last stretch, I was facing some problems with tests, but I think I have got it now, additionally @buraksenn should be pushing some co

Re: [PR] fix(expr-common): Coerce to Decimal(20, 0) when combining UInt64 with signed integers [datafusion]

2025-01-21 Thread via GitHub
nuno-faria commented on code in PR #14223: URL: https://github.com/apache/datafusion/pull/14223#discussion_r1923882417 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -777,29 +777,20 @@ pub fn binary_numeric_coercion( (_, Float32) | (Float32, _) => Some(Flo

Re: [I] Any plan to support flink [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on issue #1311: URL: https://github.com/apache/datafusion-comet/issues/1311#issuecomment-2604974259 It looks like I was mistaken about Gluten supporting Flink. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [PR] fix concat ws simplify [datafusion]

2025-01-21 Thread via GitHub
jonahgao commented on code in PR #14213: URL: https://github.com/apache/datafusion/pull/14213#discussion_r1923884148 ## datafusion/sqllogictest/test_files/expr.slt: ## @@ -465,6 +465,11 @@ SELECT concat_ws('|','a',NULL,NULL) a +query T +SELECT concat_ws('','a','b','c')

Re: [PR] fix: [comet-parquet-exec] Fix regression in supported types [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove merged PR #1309: URL: https://github.com/apache/datafusion-comet/pull/1309 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

Re: [PR] fix(expr-common): Coerce to Decimal(20, 0) when combining UInt64 with signed integers [datafusion]

2025-01-21 Thread via GitHub
jonahgao commented on code in PR #14223: URL: https://github.com/apache/datafusion/pull/14223#discussion_r1923866675 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -777,29 +777,20 @@ pub fn binary_numeric_coercion( (_, Float32) | (Float32, _) => Some(Float

Re: [PR] fix(expr-common): Coerce to Decimal(20, 0) when combining UInt64 with signed integers [datafusion]

2025-01-21 Thread via GitHub
jonahgao commented on code in PR #14223: URL: https://github.com/apache/datafusion/pull/14223#discussion_r1923910753 ## datafusion/expr-common/src/type_coercion/binary.rs: ## @@ -777,29 +777,20 @@ pub fn binary_numeric_coercion( (_, Float32) | (Float32, _) => Some(Float

Re: [PR] feat: remove DataFusion pyarrow feat [datafusion-python]

2025-01-21 Thread via GitHub
kylebarron commented on PR #1000: URL: https://github.com/apache/datafusion-python/pull/1000#issuecomment-2605913460 > By removing the `pyarrow` dependency of DataFusion we can update `pyo3` in without requiring corresponding updates to the DataFusion core repository. FWIW this is a

Re: [PR] Feat: Support array_join function [datafusion-comet]

2025-01-21 Thread via GitHub
erenavsarogullari commented on PR #1290: URL: https://github.com/apache/datafusion-comet/pull/1290#issuecomment-2605910996 Thanks @andygrove. Sure, I will have a look today. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Merge SortMergeJoin filtered batches into larger batches [datafusion]

2025-01-21 Thread via GitHub
comphead commented on PR #14160: URL: https://github.com/apache/datafusion/pull/14160#issuecomment-2605200055 > > > BatchCoalescer > > > > > > Thanks @berkaysynnada for your feedback, if I got you right, you prefer to call the `CoalesceBatchesExec` just AFTER the `SortMergeJoinExe

Re: [I] Move `EnforceSorting` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
buraksenn commented on issue #14185: URL: https://github.com/apache/datafusion/issues/14185#issuecomment-2605292627 Sorry I could not look into this afterwards and currently working. I think my PR is ready for merge so maybe we can merge that one and then rebase from main. Otherwise, I can

[PR] chore: merge from main [datafusion-comet]

2025-01-21 Thread via GitHub
parthchandra opened a new pull request, #1317: URL: https://github.com/apache/datafusion-comet/pull/1317 merge from main -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] Making the data_imdb and clickbench_1 functions atomic. [datafusion]

2025-01-21 Thread via GitHub
Spaarsh commented on PR #14129: URL: https://github.com/apache/datafusion/pull/14129#issuecomment-2605437829 The suggested changes have been implemented and have been committed via another PR #14225. Hence I am closing this PR. -- This is an automated message from the Apache Git Service.

[PR] Made imdb download (data_imdb) function atomic [datafusion]

2025-01-21 Thread via GitHub
Spaarsh opened a new pull request, #14225: URL: https://github.com/apache/datafusion/pull/14225 ## Which issue does this PR close? Closes #14128 ## Rationale for this change Due to non-atomic downloads, the user would need to manually remove files/folders created by

Re: [PR] Making the data_imdb and clickbench_1 functions atomic. [datafusion]

2025-01-21 Thread via GitHub
Spaarsh closed pull request #14129: Making the data_imdb and clickbench_1 functions atomic. URL: https://github.com/apache/datafusion/pull/14129 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

Re: [PR] Move `EnforceDistribution` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
logan-keede commented on code in PR #14190: URL: https://github.com/apache/datafusion/pull/14190#discussion_r1924024388 ## datafusion/physical-optimizer/Cargo.toml: ## @@ -48,6 +49,7 @@ futures = { workspace = true } itertools = { workspace = true } log = { workspace = true }

[PR] Only support escape literals for Postgres, Redshift and generic dialect [datafusion-sqlparser-rs]

2025-01-21 Thread via GitHub
hansott opened a new pull request, #1674: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1674 Example in MySQL: https://github.com/user-attachments/assets/7c4d652e-d78f-4b8d-88b5-b8d7f010db97 -- This is an automated message from the Apache Git Service. To resp

Re: [I] Move `EnforceSorting` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
logan-keede commented on issue #14185: URL: https://github.com/apache/datafusion/issues/14185#issuecomment-2605311570 No worries, I did it myself. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [I] Support deltalake [datafusion-ballista]

2025-01-21 Thread via GitHub
milenkovicm commented on issue #456: URL: https://github.com/apache/datafusion-ballista/issues/456#issuecomment-2605730109 latest ballista can be extended to support delta table, an example can be found at https://github.com/milenkovicm/ballista_delta -- This is an automated message from

[PR] wip: Another attempt at merging comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove opened a new pull request, #1318: URL: https://github.com/apache/datafusion-comet/pull/1318 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

[PR] fix: do not compile `keda.proto` if feature not used. [datafusion-ballista]

2025-01-21 Thread via GitHub
milenkovicm opened a new pull request, #1168: URL: https://github.com/apache/datafusion-ballista/pull/1168 # Which issue does this PR close? Closes None. # Rationale for this change Scheduler compiles `keda.proto` even if it's disabled, compilation requires `protoc` ins

Re: [PR] Move `EnforceDistribution` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
logan-keede commented on PR #14190: URL: https://github.com/apache/datafusion/pull/14190#issuecomment-2605221257 To move tests to `datafusion/core/tests/physical_optimizer`, I chose to make a few required functions public; I am not sure if that is detrimental to code quality or not. Al

[PR] Faster reverse() string function for ASCII-only case [datafusion]

2025-01-21 Thread via GitHub
UBarney opened a new pull request, #14195: URL: https://github.com/apache/datafusion/pull/14195 ## Which issue does this PR close? Closes #12445. ## Rationale for this change See the issue. ## What changes are included in this PR? + If all charac
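
The PR description above is truncated, so the following is only a rough sketch of what an ASCII-only fast path for string reversal can look like. It works on a plain `&str` rather than the Arrow string arrays handled in `reverse.rs`; the function name `reverse_str` is hypothetical, and the non-ASCII fallback here reverses by `char`, which may differ from what DataFusion actually does.

```rust
/// Hypothetical helper: reverse a string, taking a byte-level shortcut for ASCII input.
fn reverse_str(s: &str) -> String {
    if s.is_ascii() {
        // Every ASCII character is exactly one byte, so reversing the bytes
        // reverses the characters and the result is still valid UTF-8.
        let mut bytes = s.as_bytes().to_vec();
        bytes.reverse();
        String::from_utf8(bytes).expect("reversed ASCII is valid UTF-8")
    } else {
        // General case: decode and reverse by Unicode scalar value.
        s.chars().rev().collect()
    }
}

fn main() {
    println!("{}", reverse_str("abc"));
    println!("{}", reverse_str("héllo"));
}
```

The shortcut pays off because the ASCII branch skips UTF-8 decoding and only needs a single `is_ascii` scan plus a byte reversal.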

Re: [PR] Faster reverse() string function for ASCII-only case [datafusion]

2025-01-21 Thread via GitHub
Dandandan commented on code in PR #14195: URL: https://github.com/apache/datafusion/pull/14195#discussion_r1924121585 ## datafusion/functions/src/unicode/reverse.rs: ## @@ -119,12 +119,21 @@ fn reverse_impl<'a, T: OffsetSizeTrait, V: StringArrayType<'a>>( ) -> Result { le

Re: [PR] Faster reverse() string function for ASCII-only case [datafusion]

2025-01-21 Thread via GitHub
Dandandan closed pull request #14195: Faster reverse() string function for ASCII-only case URL: https://github.com/apache/datafusion/pull/14195 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [I] Add documentation for `<=>` operator [datafusion]

2025-01-21 Thread via GitHub
comphead closed issue #14203: Add documentation for `<=>` operator URL: https://github.com/apache/datafusion/issues/14203 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

Re: [PR] Added documentation for the Spaceship Operator (<=>) [datafusion]

2025-01-21 Thread via GitHub
comphead merged PR #14214: URL: https://github.com/apache/datafusion/pull/14214 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@dataf

[I] Profile memory usage and add guidance to the tuning guide [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove opened a new issue, #1315: URL: https://github.com/apache/datafusion-comet/issues/1315 ### What is the problem the feature request solves? We do not yet have any documentation explaining how much off-heap memory Comet uses compared to Spark. We should profile this and add do

Re: [I] Result mismatch with vanilla spark in hash function with decimal input [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on issue #1294: URL: https://github.com/apache/datafusion-comet/issues/1294#issuecomment-2606013700 I found more differences through fuzz testing. Query is `SELECT a, xxhash64(a)` for `Decimal(36,18)` input. ``` !== Correct Answer - 100 ==

Re: [I] Support Rust UDF [datafusion-ballista]

2025-01-21 Thread via GitHub
milenkovicm commented on issue #993: URL: https://github.com/apache/datafusion-ballista/issues/993#issuecomment-2605113194 I believe at this point ballista provides extension functionality for users to implement this functionality themselves if needed. Implementation is not trivial, hence

Re: [PR] chore: [comet-parquet-exec] enable native scan by default (again) [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove merged PR #1302: URL: https://github.com/apache/datafusion-comet/pull/1302 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

[PR] chore: Merge remote-tracking branch 'apache/main' into comet-parquet-exec - 20240121 [datafusion-comet]

2025-01-21 Thread via GitHub
parthchandra opened a new pull request, #1316: URL: https://github.com/apache/datafusion-comet/pull/1316 Bring up to date with `main` There were no files changed to resolve the merge conflicts. -- This is an automated message from the Apache Git Service. To respond to the message, ple

Re: [PR] chore: Merge remote-tracking branch 'apache/main' into comet-parquet-exec - 20240121 [datafusion-comet]

2025-01-21 Thread via GitHub
parthchandra commented on PR #1316: URL: https://github.com/apache/datafusion-comet/pull/1316#issuecomment-2605596625 @andygrove @mbutrovich -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] chore: Merge remote-tracking branch 'apache/main' into comet-parquet-exec - 20240121 [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove merged PR #1316: URL: https://github.com/apache/datafusion-comet/pull/1316 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

Re: [PR] Added documentation for the Spaceship Operator (<=>) [datafusion]

2025-01-21 Thread via GitHub
comphead commented on code in PR #14214: URL: https://github.com/apache/datafusion/pull/14214#discussion_r1924035483 ## docs/source/user-guide/sql/operators.md: ## @@ -110,6 +110,7 @@ Modulo (remainder) - [<= (less than or equal to)](#op_le) - [> (greater than)](#op_gt) - [>=

Re: [PR] Added documentation for the Spaceship Operator (<=>) [datafusion]

2025-01-21 Thread via GitHub
comphead commented on code in PR #14214: URL: https://github.com/apache/datafusion/pull/14214#discussion_r1924035829 ## docs/source/user-guide/sql/operators.md: ## @@ -207,6 +208,48 @@ Greater Than or Equal To +--+ ``` +(op_spaceship)= + +### `<=>` + +Thr

Re: [PR] Added documentation for the Spaceship Operator (<=>) [datafusion]

2025-01-21 Thread via GitHub
Spaarsh commented on code in PR #14214: URL: https://github.com/apache/datafusion/pull/14214#discussion_r1924039540 ## docs/source/user-guide/sql/operators.md: ## @@ -207,6 +208,48 @@ Greater Than or Equal To +--+ ``` +(op_spaceship)= + +### `<=>` + +Thre

Re: [PR] Feat: Support array_intersect function [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1271: URL: https://github.com/apache/datafusion-comet/pull/1271#issuecomment-2605560537 > > Thanks @erenavsarogullari. It would be great to have help with this. I will try and add some more notes to the issue with suggestions for how we can improve coverage. >

Re: [I] Move `EnforceSorting` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
alamb commented on issue #14185: URL: https://github.com/apache/datafusion/issues/14185#issuecomment-2605784105 > Sorry I could not look into this afterwards and currently working. I think my PR is ready for merge so maybe we can merge that one and then rebase from main. Otherwise, I can l

Re: [PR] Made imdb download (data_imdb) function atomic [datafusion]

2025-01-21 Thread via GitHub
alamb commented on code in PR #14225: URL: https://github.com/apache/datafusion/pull/14225#discussion_r1924394309 ## benchmarks/bench.sh: ## @@ -536,23 +536,52 @@ data_imdb() { done if [ "$convert_needed" = true ]; then -if [ ! -f "${imdb_dir}/imdb.tgz" ]; th

Re: [PR] Feature Unifying source execution plans [datafusion]

2025-01-21 Thread via GitHub
alamb commented on PR #14224: URL: https://github.com/apache/datafusion/pull/14224#issuecomment-2605797934 Thank you @mertak-synnada -- I plan to study this PR carefully tomorrow -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

Re: [PR] Add related source code locations to errors [datafusion]

2025-01-21 Thread via GitHub
eliaperantoni commented on code in PR #13664: URL: https://github.com/apache/datafusion/pull/13664#discussion_r1923512091 ## datafusion/sql/src/expr/mod.rs: ## @@ -165,13 +177,126 @@ impl SqlToRel<'_, S> { schema: &DFSchema, planner_context: &mut PlannerContext

Re: [I] Move `EnforceSorting` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
buraksenn commented on issue #14185: URL: https://github.com/apache/datafusion/issues/14185#issuecomment-2604596506 @logan-keede I just saw this comment. I had a test error that I'm fixing now. I think it will be complete in an hour -- This is an automated message from the Apache Git Ser

Re: [PR] Move EnforceSorting into datafusion-physical-optimizer crate [datafusion]

2025-01-21 Thread via GitHub
buraksenn commented on PR #14219: URL: https://github.com/apache/datafusion/pull/14219#issuecomment-2604038051 I have these errors in the `replace_with_order_preserving_variants` tests: ``` expected: [ "SortExec: expr=[a@0 ASC NULLS LAST], preserve_partitioning=[fals

Re: [PR] Add supports for Hive's `SELECT ... GROUP BY .. GROUPING SETS` syntax [datafusion-sqlparser-rs]

2025-01-21 Thread via GitHub
alamb commented on PR #1653: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1653#issuecomment-2604096899 Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look -- This is an automated messag

Re: [I] Jan 18, 2025: This week(s) in DataFusion [datafusion]

2025-01-21 Thread via GitHub
alamb commented on issue #14179: URL: https://github.com/apache/datafusion/issues/14179#issuecomment-2604361327 @2010YOUY01 became a committer: https://lists.apache.org/thread/b7df0bpzyzzcg6ph50swx7jw0b5dks75 🎉 -- This is an automated message from the Apache Git Service. To respond to t

[PR] National strings: check if dialect supports backslash escape [datafusion-sqlparser-rs]

2025-01-21 Thread via GitHub
hansott opened a new pull request, #1672: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1672 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [I] Move `EnforceSorting` into `datafusion-physical-optimizer` crate [datafusion]

2025-01-21 Thread via GitHub
logan-keede commented on issue #14185: URL: https://github.com/apache/datafusion/issues/14185#issuecomment-2604640452 @buraksenn can you push it directly to #14190, I have already merged your previous commits so it should not cause you much trouble? Thanks -- This is an automated messa

Re: [PR] chore: fix executor build issue on release [datafusion-ballista]

2025-01-21 Thread via GitHub
andygrove merged PR #1167: URL: https://github.com/apache/datafusion-ballista/pull/1167 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr.

Re: [PR] Feat: Support arrays_overlap function [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on code in PR #1312: URL: https://github.com/apache/datafusion-comet/pull/1312#discussion_r1924572763 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -2568,4 +2568,21 @@ class CometExpressionSuite extends CometTestBase with Adaptiv

Re: [PR] Feat: Support arrays_overlap function [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on code in PR #1312: URL: https://github.com/apache/datafusion-comet/pull/1312#discussion_r1924572159 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2300,6 +2300,12 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde

Re: [PR] fix: Improve testing for array_remove and fallback to Spark for unsupported types [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on code in PR #1308: URL: https://github.com/apache/datafusion-comet/pull/1308#discussion_r1924581676 ## spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala: ## @@ -2660,4 +2660,19 @@ class CometExpressionSuite extends CometTestBase with

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606127743 @andygrove @parthchandra Why are we adding files like `native/spark-expr/src/strings.rs`? -- This is an automated message from the Apache Git Service. To respond t

Re: [PR] Substrait support for propagating TableScan.filters to Substrait ReadRel.best_effort_filter [datafusion]

2025-01-21 Thread via GitHub
jonahgao commented on PR #14194: URL: https://github.com/apache/datafusion/pull/14194#issuecomment-2606146741 > Hi @jonahgao , I've fixed the previous workflow failure. Would you be able to approve the latest workflow please? Or would you recommend a set of checks I should do offline first?

Re: [PR] test: interval analysis unit tests [datafusion]

2025-01-21 Thread via GitHub
hiltontj commented on code in PR #14189: URL: https://github.com/apache/datafusion/pull/14189#discussion_r1924611242 ## datafusion/physical-expr/src/analysis.rs: ## @@ -246,3 +246,124 @@ fn calculate_selectivity( acc * cardinality_ratio(&initial.interval, &target.in

Re: [PR] Substrait support for propagating TableScan.filters to Substrait ReadRel.best_effort_filter [datafusion]

2025-01-21 Thread via GitHub
jonahgao commented on code in PR #14194: URL: https://github.com/apache/datafusion/pull/14194#discussion_r1924615742 ## datafusion/substrait/src/logical_plan/producer.rs: ## @@ -559,12 +559,31 @@ pub fn from_table_scan( let table_schema = scan.source.schema().to_dfschema_re

Re: [PR] Substrait support for propagating TableScan.filters to Substrait ReadRel.best_effort_filter [datafusion]

2025-01-21 Thread via GitHub
jonahgao commented on code in PR #14194: URL: https://github.com/apache/datafusion/pull/14194#discussion_r1924616969 ## datafusion/substrait/src/logical_plan/producer.rs: ## @@ -559,12 +559,31 @@ pub fn from_table_scan( let table_schema = scan.source.schema().to_dfschema_re

Re: [PR] Substrait support for propagating TableScan.filters to Substrait ReadRel.best_effort_filter [datafusion]

2025-01-21 Thread via GitHub
jamxia155 commented on code in PR #14194: URL: https://github.com/apache/datafusion/pull/14194#discussion_r1924641365 ## datafusion/substrait/src/logical_plan/producer.rs: ## @@ -559,12 +559,31 @@ pub fn from_table_scan( let table_schema = scan.source.schema().to_dfschema_r

Re: [PR] fix concat ws simplify [datafusion]

2025-01-21 Thread via GitHub
cht42 commented on code in PR #14213: URL: https://github.com/apache/datafusion/pull/14213#discussion_r1924657387 ## datafusion/sqllogictest/test_files/expr.slt: ## @@ -465,6 +465,11 @@ SELECT concat_ws('|','a',NULL,NULL) a +query T +SELECT concat_ws('','a','b','c') Re

Re: [PR] Substrait support for propagating TableScan.filters to Substrait ReadRel.best_effort_filter [datafusion]

2025-01-21 Thread via GitHub
jamxia155 commented on PR #14194: URL: https://github.com/apache/datafusion/pull/14194#issuecomment-2606046447 Hi @jonahgao , I've fixed the previous workflow failure. Would you be able to approve the latest workflow please? Or would you recommend a set of checks I should do offline first?

Re: [PR] Update logical-types to main [datafusion]

2025-01-21 Thread via GitHub
jayzhan211 commented on PR #14202: URL: https://github.com/apache/datafusion/pull/14202#issuecomment-2606044074 > Thanks! Should I open a new PR so that we can try it with the new branch? What do you mean by new branch?

Re: [I] Automate updating sqllogictest updates [datafusion]

2025-01-21 Thread via GitHub
Omega359 commented on issue #14158: URL: https://github.com/apache/datafusion/issues/14158#issuecomment-2606064028 So I have a bit of an issue with this. I don't know how to implement this without pointing to a branch in my fork of sqllogictest-rs. The code in that branch is not appropriate

Re: [PR] build(deps): bump pprof from 0.13.0 to 0.14.0 in /native [datafusion-comet]

2025-01-21 Thread via GitHub
viirya merged PR #1319: URL: https://github.com/apache/datafusion-comet/pull/1319

Re: [PR] fix: Improve testing for array_remove and fallback to Spark for unsupported types [datafusion-comet]

2025-01-21 Thread via GitHub
viirya commented on code in PR #1308: URL: https://github.com/apache/datafusion-comet/pull/1308#discussion_r1924585115 ## fuzz-testing/src/main/scala/org/apache/comet/fuzz/QueryRunner.scala: ## @@ -64,8 +65,12 @@ object QueryRunner { val sparkRows = df.collect()

[PR] chore: Fix merge conflicts from merging comet-parquet-exec into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove opened a new pull request, #1320: URL: https://github.com/apache/datafusion-comet/pull/1320 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606140490 @andygrove @parthchandra ``` // data source V1 case scanExec @ FileSourceScanExec( HadoopFsRelation(_, partitionSchema, _

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
kazuyukitanimura commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606141084 @andygrove @parthchandra @comphead I think we should revert this change and redo the merge at this point

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606140765 Follow on PR to fix merge issues - https://github.com/apache/datafusion-comet/pull/1320

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606145521 > @andygrove @parthchandra Why are we adding files like `native/spark-expr/src/strings.rs`? There was some refactoring in the past few days in main that resulted in the

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606143968 > This looks `CometScanExec` gets used regardless `COMET_EXEC_ENABLED` and `COMET_NATIVE_SCAN_IMPL` checks The CometScanExec gets replaced later on: ```scala

Re: [PR] chore: merge comet-parquet-exec branch into main [datafusion-comet]

2025-01-21 Thread via GitHub
andygrove commented on PR #1318: URL: https://github.com/apache/datafusion-comet/pull/1318#issuecomment-2606146267 > @andygrove @parthchandra Is the change in `docs/source/contributor-guide/benchmarking.md` relevant? This was a merge issue. Will be fixed in https://github.com/apache
