Re: [PR] feat: Support IntegralDivide function [datafusion-comet]

2025-02-19 Thread via GitHub
codecov-commenter commented on PR #1428: URL: https://github.com/apache/datafusion-comet/pull/1428#issuecomment-2670674829 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1428?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Simple Functions Preview [datafusion]

2025-02-19 Thread via GitHub
findepi commented on PR #14668: URL: https://github.com/apache/datafusion/pull/14668#issuecomment-2670665041 Marked this as non-draft. It would be great to have reviews here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
findepi commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962985669 ## datafusion/core/src/execution/context/csv.rs: ## @@ -116,11 +116,11 @@ mod tests { assert_eq!(results.len(), 1); let expected = [ -

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
findepi commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962982802 ## datafusion/functions-aggregate/src/count.rs: ## @@ -139,6 +148,185 @@ impl AggregateUDFImpl for Count { "count" } +fn schema_name(&self, par

Re: [PR] Add support column prefix index for MySQL [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
zzzdong commented on code in PR #1732: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1732#discussion_r1962930972 ## src/ast/mod.rs: ## @@ -8591,6 +8591,61 @@ pub enum CopyIntoSnowflakeKind { Location, } +/// Index Field +/// +/// This structure used here [

Re: [PR] 14709 : migrated all the UDFS to invoke_with_args [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on code in PR #14779: URL: https://github.com/apache/datafusion/pull/14779#discussion_r1962924468 ## datafusion/functions/src/unicode/character_length.rs: ## @@ -88,11 +88,7 @@ impl ScalarUDFImpl for CharacterLengthFunc { utf8_to_int_type(&arg_types[0

[PR] feat: Support IntegralDivide function [datafusion-comet]

2025-02-19 Thread via GitHub
wForget opened a new pull request, #1428: URL: https://github.com/apache/datafusion-comet/pull/1428 ## Which issue does this PR close? Closes #1422. ## Rationale for this change Support IntegralDivide function ## What changes are included in this PR?

Re: [PR] Add support for `ORDER BY ALL` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
PokIsemaine commented on code in PR #1724: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1724#discussion_r1962907240 ## src/parser/mod.rs: ## @@ -9191,17 +9191,31 @@ impl<'a> Parser<'a> { pub fn parse_optional_order_by(&mut self) -> Result, ParserError> {

Re: [PR] Add support column prefix index for MySQL [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio commented on code in PR #1732: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1732#discussion_r1962898767 ## src/ast/mod.rs: ## @@ -8591,6 +8591,61 @@ pub enum CopyIntoSnowflakeKind { Location, } +/// Index Field +/// +/// This structure used here [`

Re: [PR] Add support for `ORDER BY ALL` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
PokIsemaine commented on code in PR #1724: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1724#discussion_r1962885448 ## src/parser/mod.rs: ## @@ -13405,6 +13419,19 @@ impl<'a> Parser<'a> { }) } +pub fn parse_order_by_all(&mut self) -> Result {

Re: [PR] Add support for `ORDER BY ALL` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
PokIsemaine commented on code in PR #1724: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1724#discussion_r1962883847 ## src/parser/mod.rs: ## @@ -9191,17 +9191,31 @@ impl<'a> Parser<'a> { pub fn parse_optional_order_by(&mut self) -> Result, ParserError> {

Re: [PR] Add support for `ORDER BY ALL` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
PokIsemaine commented on code in PR #1724: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1724#discussion_r1962888283 ## src/dialect/duckdb.rs: ## @@ -89,4 +89,9 @@ impl Dialect for DuckDbDialect { fn supports_from_first_select(&self) -> bool { true

Re: [PR] Treat COLLATE like any other column option [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio merged PR #1731: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1731

Re: [PR] Extend Visitor trait for Value type [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio commented on code in PR #1725: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1725#discussion_r1962877173 ## src/ast/visitor.rs: ## @@ -889,34 +909,74 @@ mod tests { ), ]; for (sql, expected) in tests { -let actual

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
wForget commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2670481731 Thanks @kazuyukitanimura @parthchandra , I will try to fix this issue in upstream.

Re: [I] Project Ideas for GSoC 2025 (Google Summer of Code) [datafusion]

2025-02-19 Thread via GitHub
sidshehria commented on issue #14478: URL: https://github.com/apache/datafusion/issues/14478#issuecomment-2670433562 Hi everyone, I believe improving Python bindings in Apache DataFusion would be a great step forward in making it more accessible to data engineers and analysts. Expand

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2670429714 Fixed the row count in latest PR, thanks @2010YOUY01 ! It's the same count before this PR. ```rust /usr/bin/time -l cargo run --release -- --mem-pool-type fair -m 5G

[PR] Add support column prefix index for MySQL [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
zzzdong opened a new pull request, #1732: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1732 This pull request adds support for column prefix and functional indexes in MySQL's `CREATE TABLE` and `ALTER TABLE` statements. ```sql ALTER TABLE `tbl_1` ADD KEY `idx_1` (col_1(

Re: [I] Migrate Unicode function to `invoke_with_args` [datafusion]

2025-02-19 Thread via GitHub
sidshehria commented on issue #14709: URL: https://github.com/apache/datafusion/issues/14709#issuecomment-2670405671 Done, I have resolved this issue. Kindly review my pull request: https://github.com/apache/datafusion/pull/14779

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2670405399 > Don't break early makes sense to me, I believe it's intended in most cases, let's keep it simple. > > If the query is selecting the whole table, this result does not look

Re: [PR] 14709 : migrated all the UDFS to invoke_with_args [datafusion]

2025-02-19 Thread via GitHub
sidshehria commented on code in PR #14779: URL: https://github.com/apache/datafusion/pull/14779#discussion_r1962821119 ## datafusion/functions/src/unicode/character_length.rs: ## @@ -88,7 +88,7 @@ impl ScalarUDFImpl for CharacterLengthFunc { utf8_to_int_type(&arg_types[

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
parthchandra commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2670401123 > `wrapping_rem` seems promising. On top we may need to test with both ANSI and non-ANSI mode Yes, I agree that wrapping_rem seems like a reasonable way to address this.

Re: [PR] fix: fix various unit test failures in native_datafusion and native_iceberg_compat readers [datafusion-comet]

2025-02-19 Thread via GitHub
parthchandra commented on code in PR #1415: URL: https://github.com/apache/datafusion-comet/pull/1415#discussion_r1962790306 ## spark/src/main/scala/org/apache/comet/parquet/CometParquetPartitionReaderFactory.scala: ## @@ -71,14 +71,26 @@ case class CometParquetPartitionReaderFa

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
2010YOUY01 commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2670371222 Don't break early makes sense to me, I believe it's intended in most cases, let's keep it simple. If the query is selecting the whole table, this result does not look correc

Re: [PR] 14709 : migrated all the UDFS to invoke_with_args [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on code in PR #14779: URL: https://github.com/apache/datafusion/pull/14779#discussion_r1962766206 ## datafusion/functions/src/unicode/character_length.rs: ## @@ -88,7 +88,7 @@ impl ScalarUDFImpl for CharacterLengthFunc { utf8_to_int_type(&arg_types[0]

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
jonahgao commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962798687 ## datafusion/functions-aggregate/src/count.rs: ## @@ -359,6 +547,15 @@ impl AggregateUDFImpl for Count { } } +fn is_count_wildcard(args: &[Expr]) -> bool

Re: [PR] chore: fix docker [datafusion-comet]

2025-02-19 Thread via GitHub
codecov-commenter commented on PR #1427: URL: https://github.com/apache/datafusion-comet/pull/1427#issuecomment-2670360974 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1427?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962792402 ## datafusion/functions-aggregate/src/count.rs: ## @@ -359,6 +547,15 @@ impl AggregateUDFImpl for Count { } } +fn is_count_wildcard(args: &[Expr]) -> bo

Re: [I] [EPIC] Substrait: Add producer and consumer for physical plans [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on issue #5173: URL: https://github.com/apache/datafusion/issues/5173#issuecomment-2670348600 @alamb Thanks for your advice. I would first pick a few small tickets to be more familiar with the codebase.

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
jonahgao commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962777189 ## datafusion/functions-aggregate/src/count.rs: ## @@ -359,6 +547,15 @@ impl AggregateUDFImpl for Count { } } +fn is_count_wildcard(args: &[Expr]) -> bool

Re: [PR] Add `statistics_truncate_length` parquet writer config [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on PR #14782: URL: https://github.com/apache/datafusion/pull/14782#issuecomment-2670303512 > failed to install component: 'clippy-preview-x86_64-unknown-linux-gnu', detected conflict: 'bin/cargo-clippy' Seems there's a race condition on rustup. See: https://github.com/

Re: [PR] [infra] Fail Clippy on rust build warnings [datafusion-python]

2025-02-19 Thread via GitHub
kevinjqliu commented on PR #1029: URL: https://github.com/apache/datafusion-python/pull/1029#issuecomment-2670300780 ah I was going to rebase after #1030 is merged, but this works too. Thanks!

Re: [PR] Add `statistics_truncate_length` parquet writer config [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on code in PR #14782: URL: https://github.com/apache/datafusion/pull/14782#discussion_r1962758678 ## datafusion/common/src/config.rs: ## @@ -503,6 +503,10 @@ config_namespace! { /// (writing) Sets column index truncate length pub column_index

Re: [PR] [infra] Fail Clippy on rust build warnings [datafusion-python]

2025-02-19 Thread via GitHub
timsaucer merged PR #1029: URL: https://github.com/apache/datafusion-python/pull/1029

Re: [I] CI should fail if rust warnings are generated [datafusion-python]

2025-02-19 Thread via GitHub
timsaucer closed issue #1028: CI should fail if rust warnings are generated URL: https://github.com/apache/datafusion-python/issues/1028

Re: [PR] feat: scalar regex match physical expr [datafusion]

2025-02-19 Thread via GitHub
github-actions[bot] commented on PR #12270: URL: https://github.com/apache/datafusion/pull/12270#issuecomment-2670230447 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] fix: graceful NULL and type error handling in array functions [datafusion]

2025-02-19 Thread via GitHub
alan910127 commented on code in PR #14737: URL: https://github.com/apache/datafusion/pull/14737#discussion_r1962752150 ## datafusion/expr-common/src/signature.rs: ## @@ -358,6 +358,8 @@ pub enum ArrayFunctionArgument { /// An argument of type List/LargeList/FixedSizeList. A

Re: [PR] Support Null aware anti join by HashJoin [datafusion]

2025-02-19 Thread via GitHub
github-actions[bot] commented on PR #10584: URL: https://github.com/apache/datafusion/pull/10584#issuecomment-2670230511 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [PR] pyo3 update required changes to deprecated interfaces [datafusion-python]

2025-02-19 Thread via GitHub
timsaucer commented on PR #1030: URL: https://github.com/apache/datafusion-python/pull/1030#issuecomment-2670215078 Closing in favor of #1029

Re: [PR] pyo3 update required changes to deprecated interfaces [datafusion-python]

2025-02-19 Thread via GitHub
timsaucer closed pull request #1030: pyo3 update required changes to deprecated interfaces URL: https://github.com/apache/datafusion-python/pull/1030

Re: [PR] pyo3 update required changes to deprecated interfaces [datafusion-python]

2025-02-19 Thread via GitHub
kevinjqliu commented on PR #1030: URL: https://github.com/apache/datafusion-python/pull/1030#issuecomment-2670195254 I added this PR to #1029 to check CI and everything is green!

Re: [PR] chore: fix docker [datafusion-comet]

2025-02-19 Thread via GitHub
comphead merged PR #1427: URL: https://github.com/apache/datafusion-comet/pull/1427

Re: [I] CometHashJoin always selects BuildRight which causes potential performance regression [datafusion-comet]

2025-02-19 Thread via GitHub
andygrove commented on issue #1382: URL: https://github.com/apache/datafusion-comet/issues/1382#issuecomment-2670163116 Thanks for writing this up @hayman42. I opened a PR https://github.com/apache/datafusion-comet/pull/1424 to update `RewriteRule` to match latest version in Gluten and I a

[PR] chore: debug no space left on device [datafusion-comet]

2025-02-19 Thread via GitHub
comphead opened a new pull request, #1426: URL: https://github.com/apache/datafusion-comet/pull/1426 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

[PR] Fix docker [datafusion-comet]

2025-02-19 Thread via GitHub
comphead opened a new pull request, #1427: URL: https://github.com/apache/datafusion-comet/pull/1427 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] fix: graceful NULL and type error handling in array functions [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14737: URL: https://github.com/apache/datafusion/pull/14737#discussion_r1962608614 ## datafusion/expr-common/src/signature.rs: ## @@ -358,6 +358,8 @@ pub enum ArrayFunctionArgument { /// An argument of type List/LargeList/FixedSizeList. A

Re: [PR] chore: debug no space left on device [datafusion-comet]

2025-02-19 Thread via GitHub
comphead merged PR #1426: URL: https://github.com/apache/datafusion-comet/pull/1426

[PR] Add support for `Dictionary` to AST datatype [datafusion]

2025-02-19 Thread via GitHub
cetra3 opened a new pull request, #14783: URL: https://github.com/apache/datafusion/pull/14783 ## Which issue does this PR close? No issue raised, just something we've seen in production ## Rationale for this change Ensures we can convert dictionary types to ast datatypes

Re: [PR] chore: docker no space left on device [datafusion-comet]

2025-02-19 Thread via GitHub
comphead merged PR #1425: URL: https://github.com/apache/datafusion-comet/pull/1425

[PR] chore: docker no space left on device [datafusion-comet]

2025-02-19 Thread via GitHub
comphead opened a new pull request, #1425: URL: https://github.com/apache/datafusion-comet/pull/1425 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## How are these changes

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962597381 ## datafusion/functions-aggregate/src/count.rs: ## @@ -139,6 +148,185 @@ impl AggregateUDFImpl for Count { "count" } +fn schema_name(&self,

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962596288 ## datafusion/functions-aggregate/src/count.rs: ## @@ -139,6 +148,185 @@ impl AggregateUDFImpl for Count { "count" } +fn schema_name(&self,

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1962595849 ## datafusion/expr/src/planner.rs: ## @@ -167,14 +170,14 @@ pub trait ExprPlanner: Debug + Send + Sync { /// Plan an extract expression, such as`EXTRACT(

Re: [PR] Examples: boundary analysis example for `AND/OR` conjunctions [datafusion]

2025-02-19 Thread via GitHub
clflushopt commented on PR #14735: URL: https://github.com/apache/datafusion/pull/14735#issuecomment-2670109681 @alamb @berkaysynnada this is the follow up for #14688

Re: [PR] [wip] Update RewriteJoin logic [datafusion-comet]

2025-02-19 Thread via GitHub
codecov-commenter commented on PR #1424: URL: https://github.com/apache/datafusion-comet/pull/1424#issuecomment-2670068841 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1424?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
shehabgamin commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1962564677 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coe

[PR] pyo3 update required changes to deprecated interfaces [datafusion-python]

2025-02-19 Thread via GitHub
timsaucer opened a new pull request, #1030: URL: https://github.com/apache/datafusion-python/pull/1030 # Which issue does this PR close? None, but it supports PR https://github.com/apache/datafusion-python/pull/1029 # Rationale for this change pyo3 update recently intro

[PR] [wip] Update RewriteJoin logic [datafusion-comet]

2025-02-19 Thread via GitHub
andygrove opened a new pull request, #1424: URL: https://github.com/apache/datafusion-comet/pull/1424 ## Which issue does this PR close? Related to https://github.com/apache/datafusion-comet/issues/1382 ## Rationale for this change Experimenting at this st

Re: [PR] fix: workaround to get benchmarks working again [datafusion-ballista]

2025-02-19 Thread via GitHub
andygrove closed pull request #1184: fix: workaround to get benchmarks working again URL: https://github.com/apache/datafusion-ballista/pull/1184

[PR] Treat COLLATE like any other column option [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
mvzink opened a new pull request, #1731: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1731 This allows preserving the orderings of column options. We also no longer hardcode a list of dialects which accept a column collation, as it's part of the SQL standard and some di

[PR] Add `statistics_truncate_length` parquet writer config [datafusion]

2025-02-19 Thread via GitHub
akoshchiy opened a new pull request, #14782: URL: https://github.com/apache/datafusion/pull/14782 ## Which issue does this PR close? - Closes #14601. ## Are these changes tested? No. Covered by existing tests. ## Are there any user-facing changes? No

Re: [I] CI should fail if rust warnings are generated [datafusion-python]

2025-02-19 Thread via GitHub
kevinjqliu commented on issue #1028: URL: https://github.com/apache/datafusion-python/issues/1028#issuecomment-2669443404 I can look into this! I think it's just a matter of setting the `-D warnings` flag, like this https://github.com/apache/iceberg-rust/blob/446665652eedc9cd23b9a3ef1

Re: [PR] Allow setting the recursion limit for sql parsing [datafusion]

2025-02-19 Thread via GitHub
cetra3 commented on PR #14756: URL: https://github.com/apache/datafusion/pull/14756#issuecomment-2669866457 @jatin510 This number is actually the default in `sqlparser`: https://github.com/apache/datafusion-sqlparser-rs/blob/b482562618caa3efa89c2f42f87472b00a270926/src/parser/mod.rs#L187

Re: [PR] feat: configure max grpc message size and disable view types in ballista [datafusion-ballista]

2025-02-19 Thread via GitHub
andygrove merged PR #1185: URL: https://github.com/apache/datafusion-ballista/pull/1185

Re: [PR] chore: fix tpch data generator [datafusion-ballista]

2025-02-19 Thread via GitHub
andygrove merged PR #1186: URL: https://github.com/apache/datafusion-ballista/pull/1186

Re: [PR] fix: Specify max message size for Flight client and servers [datafusion-ballista]

2025-02-19 Thread via GitHub
andygrove closed pull request #1183: fix: Specify max message size for Flight client and servers URL: https://github.com/apache/datafusion-ballista/pull/1183

Re: [I] Failure when fetching shuffle partitions due to exceeding default gRPC message size [datafusion-ballista]

2025-02-19 Thread via GitHub
andygrove closed issue #1182: Failure when fetching shuffle partitions due to exceeding default gRPC message size URL: https://github.com/apache/datafusion-ballista/issues/1182

[PR] feat: improve executor loggers [datafusion-ballista]

2025-02-19 Thread via GitHub
milenkovicm opened a new pull request, #1187: URL: https://github.com/apache/datafusion-ballista/pull/1187 # Which issue does this PR close? Closes #. # Rationale for this change There are a few loggers on the executor side which print the plan as debug, which does not help a

Re: [PR] Set projection before configuring the source [datafusion]

2025-02-19 Thread via GitHub
blaginin commented on PR #14685: URL: https://github.com/apache/datafusion/pull/14685#issuecomment-2669773527 Fair, let's do it that way as a hotfix and properly solve with the builder then 🤝

[PR] fix: make `serde` feature no_std [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iajoiner opened a new pull request, #1730: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1730 Closes #1729 We use the `serde` feature of `sqlparser` in [SXT Proof of SQL](https://github.com/spaceandtimelabs/sxt-proof-of-sql) and want to ensure no_std support for some of

Re: [PR] Reuse last projection layer when renaming columns [datafusion]

2025-02-19 Thread via GitHub
blaginin commented on code in PR #14684: URL: https://github.com/apache/datafusion/pull/14684#discussion_r1962395050 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1617,9 +1617,19 @@ async fn with_column_renamed() -> Result<()> { // accepts table qualifier .

[PR] Reuse alias if possible [datafusion]

2025-02-19 Thread via GitHub
blaginin opened a new pull request, #14781: URL: https://github.com/apache/datafusion/pull/14781 ## Which issue does this PR close? Follow up on https://github.com/apache/datafusion/pull/14684#discussion_r1957428384 ## Rationale for this change Currently, alias over alia

Re: [PR] Minor: Further Clean-up in Enforce Sorting [datafusion]

2025-02-19 Thread via GitHub
ozankabak merged PR #14732: URL: https://github.com/apache/datafusion/pull/14732

Re: [I] Add a hint about expected extension in error message in register_csv, register_parquet, register_json [datafusion]

2025-02-19 Thread via GitHub
devhprl commented on issue #14144: URL: https://github.com/apache/datafusion/issues/14144#issuecomment-2669590997 @cj-zhukov @alamb hi, I've just hit this one. judging file content by an extension doesn't feel fully right to me (as you can put there basically anything). yes if there is an c

Re: [PR] fix: fix various unit test failures in native_datafusion and native_iceberg_compat readers [datafusion-comet]

2025-02-19 Thread via GitHub
kazuyukitanimura commented on code in PR #1415: URL: https://github.com/apache/datafusion-comet/pull/1415#discussion_r1962234098 ## spark/src/test/scala/org/apache/comet/parquet/ParquetReadSuite.scala: ## @@ -97,13 +102,19 @@ abstract class ParquetReadSuite extends CometTestBase

Re: [I] Introduce Extensions concept to object_store::GetOptions and object_store::PutOptions [datafusion]

2025-02-19 Thread via GitHub
waynr commented on issue #14780: URL: https://github.com/apache/datafusion/issues/14780#issuecomment-2669465459 Whoops, I just realized I opened this issue in the wrong git repo 🤦 . Moving it over to github.com/apache/arrow-rs now.

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
kazuyukitanimura commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2669467230 `wrapping_rem` seems promising. On top we may need to test with both ANSI and non-ANSI mode

Re: [I] Introduce Extensions concept to object_store::GetOptions and object_store::PutOptions [datafusion]

2025-02-19 Thread via GitHub
waynr closed issue #14780: Introduce Extensions concept to object_store::GetOptions and object_store::PutOptions URL: https://github.com/apache/datafusion/issues/14780

Re: [PR] chore: WIP prepare for upgrading to DataFusion 46 [datafusion-comet]

2025-02-19 Thread via GitHub
codecov-commenter commented on PR #1423: URL: https://github.com/apache/datafusion-comet/pull/1423#issuecomment-2669461585 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1423?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] [infra] Fail Clippy on rust build warnings [datafusion-python]

2025-02-19 Thread via GitHub
kevinjqliu commented on PR #1029: URL: https://github.com/apache/datafusion-python/pull/1029#issuecomment-2669455899 CI should fail. Now we just need to fix the warnings 😄 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] STRING_AGG missing functionality [datafusion]

2025-02-19 Thread via GitHub
geoffreyclaude commented on code in PR #14412: URL: https://github.com/apache/datafusion/pull/14412#discussion_r1962046382 ## datafusion/functions-aggregate/src/string_agg.rs: ## @@ -106,20 +113,40 @@ impl AggregateUDFImpl for StringAgg { Ok(DataType::LargeUtf8) }

Re: [PR] [infra] Fail Clippy on rust build warnings [datafusion-python]

2025-02-19 Thread via GitHub
kevinjqliu commented on PR #1029: URL: https://github.com/apache/datafusion-python/pull/1029#issuecomment-2669454698 interestingly, `-D warnings` is set here https://github.com/apache/datafusion-python/blob/40a61c150adee6beb9961302fece81c33639082e/ci/scripts/rust_clippy.sh#L21 -- This

[PR] [infra] Fail Clippy on rust build warnings [datafusion-python]

2025-02-19 Thread via GitHub
kevinjqliu opened a new pull request, #1029: URL: https://github.com/apache/datafusion-python/pull/1029 # Which issue does this PR close? Closes #1028 # Rationale for this change CI should fail if rust warnings are generated # What changes are included in

Re: [PR] fix: fix various unit test failures in native_datafusion and native_iceberg_compat readers [datafusion-comet]

2025-02-19 Thread via GitHub
comphead commented on code in PR #1415: URL: https://github.com/apache/datafusion-comet/pull/1415#discussion_r1962167231 ## common/src/main/java/org/apache/comet/parquet/NativeBatchReader.java: ## @@ -260,7 +260,8 @@ public void init() throws URISyntaxException, IOException {

Re: [PR] fix: fix various unit test failures in native_datafusion and native_iceberg_compat readers [datafusion-comet]

2025-02-19 Thread via GitHub
comphead commented on code in PR #1415: URL: https://github.com/apache/datafusion-comet/pull/1415#discussion_r1962165561 ## common/src/main/java/org/apache/comet/parquet/NativeBatchReader.java: ## @@ -260,7 +260,8 @@ public void init() throws URISyntaxException, IOException {

Re: [PR] feat: Add ScalarUDF support in FFI crate [datafusion]

2025-02-19 Thread via GitHub
timsaucer merged PR #14579: URL: https://github.com/apache/datafusion/pull/14579 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@data

Re: [I] Release DataFusion `46.0.0` [datafusion]

2025-02-19 Thread via GitHub
andygrove commented on issue #14123: URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2669417488 I've created a draft PR to upgrade Comet to use latest DataFusion: https://github.com/apache/datafusion-comet/pull/1423 -- This is an automated message from the Apache Git

[I] Introduce Extensions concept to object_store::GetOptions and object_store::PutOptions [datafusion]

2025-02-19 Thread via GitHub
waynr opened a new issue, #14780: URL: https://github.com/apache/datafusion/issues/14780 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** This problem is roughly described in #7135, but essentially we are looking for a way

Re: [I] Migrate Unicode function to `invoke_with_args` [datafusion]

2025-02-19 Thread via GitHub
sidshehria commented on issue #14709: URL: https://github.com/apache/datafusion/issues/14709#issuecomment-2668954194 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] BigQuery: Add support for `BEGIN` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio commented on code in PR #1718: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1718#discussion_r1962136546 ## src/ast/mod.rs: ## @@ -3058,6 +3058,33 @@ pub enum Statement { begin: bool, transaction: Option, modifier: Option, +

Re: [PR] Add support for `EXECUTE IMMEDIATE` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio merged PR #1717: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1717 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

Re: [PR] Reuse last projection layer when renaming columns [datafusion]

2025-02-19 Thread via GitHub
Omega359 commented on code in PR #14684: URL: https://github.com/apache/datafusion/pull/14684#discussion_r1962128561 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1617,9 +1617,19 @@ async fn with_column_renamed() -> Result<()> { // accepts table qualifier .

Re: [I] Discussion: what new functions should and should not be accepted into DataFusion [datafusion]

2025-02-19 Thread via GitHub
findepi commented on issue #14777: URL: https://github.com/apache/datafusion/issues/14777#issuecomment-2669313528 > can we provide a home for non-core functions where the community could maintain them outside of DataFusion core? you mean something like https://github.com/datafusion-c

Re: [PR] Replace `Method` and `CompositeAccess` with `CompoundFieldAccess` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio merged PR #1716: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1716 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

Re: [PR] Extending support for INDEX parsing [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio commented on PR #1707: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1707#issuecomment-2669353252 Marking this as draft in the meantime as it's no longer awaiting review, @LucaCappelletti94 please feel free to undraft and ping when ready! -- This is an automated messa

[I] CI should fail if rust warnings are generated [datafusion-python]

2025-02-19 Thread via GitHub
timsaucer opened a new issue, #1028: URL: https://github.com/apache/datafusion-python/issues/1028 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Our current release generates warnings during the rust build. These should cause

[PR] chore: WIP prepare for upgrading to DataFusion 46 [datafusion-comet]

2025-02-19 Thread via GitHub
andygrove opened a new pull request, #1423: URL: https://github.com/apache/datafusion-comet/pull/1423 ## Which issue does this PR close? Closes #. ## Rationale for this change Upgrade to latest DataFusion to make sure that there are no regressions. We
