Re: [PR] add manual trigger for extended tests in pull requests [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14331: URL: https://github.com/apache/datafusion/pull/14331#discussion_r1934303034 ## .github/workflows/extended.yml: ## @@ -31,14 +31,38 @@ on: push: branches: - main + issue_comment: +types: [created] + +permissions: + pull-r

Re: [PR] Feature: Monotonic Sets [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934314846 ## datafusion/functions-aggregate/src/min_max.rs: ## @@ -1183,6 +1187,10 @@ impl AggregateUDFImpl for Min { fn documentation(&self) -> Option<&Documentation> {

Re: [I] Expose object_store for direct use [datafusion-python]

2025-01-29 Thread via GitHub
kylebarron commented on issue #1008: URL: https://github.com/apache/datafusion-python/issues/1008#issuecomment-2622430293 I agree! This is why I created [obstore](https://github.com/developmentseed/obstore). It's a fast Python binding for `object_store`. > However, it's useful to be

Re: [PR] Feature: Monotonic Sets [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934326726 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -6203,3 +6203,97 @@ physical_plan 14)--PlaceholderRowExec 15)ProjectionExec: exp

[PR] Support marking columns as system columns via metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb opened a new pull request, #14362: URL: https://github.com/apache/datafusion/pull/14362 Closes #14057 Closes #13975 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] Support marking columns as system columns via metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2622449713 I don't know if there's any other "known" metadata, but I feel like it would be good to have an extension trait along the lines of: ```rust /// Extension of [`Field`] to ma

Re: [PR] fix: pass scale to DF round in spark_round [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1341: URL: https://github.com/apache/datafusion-comet/pull/1341#discussion_r1934414738 ## native/spark-expr/src/math_funcs/round.rs: ## @@ -135,3 +136,50 @@ fn decimal_round_f(scale: &i8, point: &i64) -> Box i128> { Box::new(mov

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934403062 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -6203,3 +6203,97 @@ physical_plan 14)--PlaceholderRowExec 15)ProjectionExec:

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
gstvg commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1934429898 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, for exam

Re: [I] Result mismatch with vanilla spark in hash function with decimal input [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove closed issue #1294: Result mismatch with vanilla spark in hash function with decimal input URL: https://github.com/apache/datafusion-comet/issues/1294

Re: [PR] fix: Fall back to Spark when hashing decimals with precision > 18 [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on PR #1325: URL: https://github.com/apache/datafusion-comet/pull/1325#issuecomment-2622599915 Thanks for the review @kazuyukitanimura and @parthchandra

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
gstvg commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1933397382 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, for exam

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
samuelcolvin commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1934433832 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, f

Re: [PR] fix: Fall back to Spark when hashing decimals with precision > 18 [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove merged PR #1325: URL: https://github.com/apache/datafusion-comet/pull/1325

[I] Add `try_new` for `LogicalPlan::Join` `Join` [datafusion]

2025-01-29 Thread via GitHub
phisn opened a new issue, #14363: URL: https://github.com/apache/datafusion/issues/14363 ### Is your feature request related to a problem or challenge? Currently one has to manually add the schema when creating a join or give an empty one and call `recompute_schema`. It would be nice

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934436738 ## native/core/src/execution/operators/scan.rs: ## @@ -304,11 +304,7 @@ fn scan_schema(input_batch: &InputBatch, data_types: &[DataType]) -> SchemaRe

Re: [PR] refactor: switch `BooleanBufferBuilder` to `NullBufferBuilder` in single_group_by [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14360: URL: https://github.com/apache/datafusion/pull/14360#discussion_r1934250018 ## datafusion/physical-plan/src/aggregates/group_values/single_group_by/primitive.rs: ## @@ -166,10 +165,12 @@ where null_idx: Option, ) -> Pri

Re: [PR] chore(deps): update rand requirement from 0.8 to 0.9 [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14333: URL: https://github.com/apache/datafusion/pull/14333#issuecomment-2622197850 The upgrade for rand needs a bit more love, see this PR from @mbrobbel in arrow-rs - https://github.com/apache/arrow-rs/pull/7045 Here is the upgrade guide for anyone who want

Re: [I] [Epic] Extract catalog functionality from the core to make it more modular [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on issue #10782: URL: https://github.com/apache/datafusion/issues/10782#issuecomment-2622328196 Hi, I am working on moving `InformationSchema` into the `datafusion-catalog`. This would require moving `core/src/datasource/streaming.rs` (`StreamingTable`) to some place

Re: [PR] Minor: include the number of files run in sqllogictest display [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #14359: URL: https://github.com/apache/datafusion/pull/14359#discussion_r1934295820 ## datafusion/sqllogictest/bin/sqllogictests.rs: ## @@ -184,7 +186,11 @@ async fn run_tests() -> Result<()> { .collect() .await; -m.println

Re: [PR] Script and documentation for regenerating sqlite test files [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on PR #14290: URL: https://github.com/apache/datafusion/pull/14290#issuecomment-2622360492 I was worried that Macs would be troublesome for this script. I'll try and adjust things for that platform. The difference in sed is ... unfortunate. The postgres errors might

Re: [PR] feat: metadata columns [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2622481316 Here's what I think is a much simpler and more flexible change: https://github.com/apache/datafusion/pull/14362

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2622510720 @chenkovsky @jayzhan211 could you please review?

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934382053 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -6203,3 +6203,97 @@ physical_plan 14)--PlaceholderRowExec 15)ProjectionExec: exp

[PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede opened a new pull request, #14364: URL: https://github.com/apache/datafusion/pull/14364 ## Which issue does this PR close? Closes #10782 ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934442772 ## native/core/src/execution/operators/scan.rs: ## @@ -304,11 +304,7 @@ fn scan_schema(input_batch: &InputBatch, data_types: &[DataType]) -> SchemaRef {

Re: [PR] equivalence classes: use normalized mapping for projection [datafusion]

2025-01-29 Thread via GitHub
askalt commented on PR #14327: URL: https://github.com/apache/datafusion/pull/14327#issuecomment-2622614689 I added a test to .slt. It was slightly tricky because setting up a table with existing equivalence classes (e.g., a being an alias for b) is not very straightforward. I took advantag

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934452868 ## native/core/src/execution/operators/scan.rs: ## @@ -304,11 +304,7 @@ fn scan_schema(input_batch: &InputBatch, data_types: &[DataType]) -> SchemaRe

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934447430 ## docs/source/user-guide/configs.md: ## @@ -64,6 +64,7 @@ Comet provides the following configuration settings. | spark.comet.explain.native.enabled

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on PR #14364: URL: https://github.com/apache/datafusion/pull/14364#issuecomment-2622655265 cc @comphead @alamb

[I] Expose object_store for direct use [datafusion-python]

2025-01-29 Thread via GitHub
matko opened a new issue, #1008: URL: https://github.com/apache/datafusion-python/issues/1008 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** I need to be able to delete old resources generated by `write_parquet()` and similar met

Re: [PR] Add RETURNS TABLE() support for CREATE FUNCTION in Postgresql [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio commented on code in PR #1687: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1687#discussion_r1934243820 ## src/parser/mod.rs: ## @@ -8866,6 +8873,24 @@ impl<'a> Parser<'a> { Ok((data, trailing_bracket)) } +fn parse_returns_table_column(

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1934244570 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2645,6 +2643,106 @@ pub struct Union { pub schema: DFSchemaRef, } +impl Union { +/// Constructs new

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #14356: URL: https://github.com/apache/datafusion/pull/14356#discussion_r1934247452 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2645,6 +2643,106 @@ pub struct Union { pub schema: DFSchemaRef, } +impl Union { +/// Constructs new

Re: [PR] Fix bug when parsing a Snowflake stage with `;` suffix [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio merged PR #1688: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1688

Re: [PR] moving memory.rs out of datafusion/core [datafusion]

2025-01-29 Thread via GitHub
comphead commented on PR #14332: URL: https://github.com/apache/datafusion/pull/14332#issuecomment-2622109979 > One way to review this is to run cargo --doc --open and then looking at how it is rendered Oh this is much easier way than I kept doing before, thanks

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-29 Thread via GitHub
findepi commented on PR #14356: URL: https://github.com/apache/datafusion/pull/14356#issuecomment-2622185174 Thanks @Omega359 for your review, addressed!

Re: [PR] Script and documentation for regenerating sqlite test files [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14290: URL: https://github.com/apache/datafusion/pull/14290#discussion_r1934283379 ## datafusion/sqllogictest/regenerate_sqlite_files.sh: ## @@ -0,0 +1,179 @@ +#!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more

Re: [PR] Fix build "missing field `sum_value` in initializer of `ColumnStatistics`" [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14345: URL: https://github.com/apache/datafusion/pull/14345#issuecomment-2622107396 > Thank you for the quick fix! Well, I also broke it 😆 I don't want to be lauded as a fire fighter if I was also the one starting the fires 😆 🚒 I saw this a few t

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on PR #14356: URL: https://github.com/apache/datafusion/pull/14356#issuecomment-2622107955 I ran the sqlite tests against this branch with no changes so the tests in there did not cover this case.

Re: [PR] Document SQL dialect guidance [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #13706: URL: https://github.com/apache/datafusion/pull/13706#discussion_r1934410805 ## docs/source/user-guide/sql/dialect.md: ## @@ -0,0 +1,53 @@ + + +# SQL Dialect + +The included SQL supported in Apache DataFusion mostly follows the [PostgreSQL

Re: [PR] Provide user-defined invariants for logical node extensions. [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14329: URL: https://github.com/apache/datafusion/pull/14329#discussion_r1934505630 ## datafusion/expr/src/logical_plan/extension.rs: ## @@ -54,6 +57,22 @@ pub trait UserDefinedLogicalNode: fmt::Debug + Send + Sync { /// Return the output schem

Re: [PR] Add RETURNS TABLE() support for CREATE FUNCTION in Postgresql [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
remysaissy commented on code in PR #1687: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1687#discussion_r1934516332 ## src/parser/mod.rs: ## @@ -4535,7 +4535,14 @@ impl<'a> Parser<'a> { self.expect_token(&Token::RParen)?; let return_type = if s
