Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
samuelcolvin commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1935157823 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, f

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
berkaysynnada commented on PR #14271: URL: https://github.com/apache/datafusion/pull/14271#issuecomment-2623769912 @alamb could you take a final look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Fix `CREATE FUNCTION` round trip for Hive dialect [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio commented on PR #1693: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1693#issuecomment-2623763546 My bad, I somehow managed to miss that the test was failing before merging -- This is an automated message from the Apache Git Service. To respond to the message, please

[PR] Fix `CREATE FUNCTION` round trip for Hive dialect [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio opened a new pull request, #1693: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1693 Fixes the test failure in #1690 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on PR #14271: URL: https://github.com/apache/datafusion/pull/14271#issuecomment-2623728047 Thanks for reviewing carefully, as always, much appreciated 🚀 > ```select a, agg(b) FROM ... GROUP BY a ORDER BY a, agg(b)``` You are right that all queries of this fo

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
berkaysynnada commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1935098072 ## datafusion/core/tests/physical_optimizer/enforce_sorting.rs: ## @@ -238,6 +241,338 @@ async fn test_remove_unnecessary_sort5() -> Result<()> { Ok(())

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
berkaysynnada commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1935090911 ## datafusion/core/tests/physical_optimizer/enforce_sorting.rs: ## @@ -238,6 +241,338 @@ async fn test_remove_unnecessary_sort5() -> Result<()> { Ok(())

Re: [PR] Fix DDL generation in case of an empty arguments function. [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio merged PR #1690: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1690 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

Re: [PR] start refactoring process by setting up base + init [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on PR #14306: URL: https://github.com/apache/datafusion/pull/14306#issuecomment-2623485431 @Rachelint I have added the test to CI, Please review it whenever you can find some time. Thanks -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1935083383 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, for exa

Re: [PR] Allow plain JOIN without turning it into INNER [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio merged PR #1692: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1692 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

[PR] 14044/enhancement/add xxhash algorithms in expression api [datafusion]

2025-01-29 Thread via GitHub
Spaarsh opened a new pull request, #14367: URL: https://github.com/apache/datafusion/pull/14367 ## Which issue does this PR close? Closes #14044. ## Rationale for this change Lack of xxhash (a quick, non-cryptographic hashing technique) functions. ## What c

Re: [I] Implement xxhash algorithms as part of the expression API [datafusion]

2025-01-29 Thread via GitHub
Spaarsh commented on issue #14044: URL: https://github.com/apache/datafusion/issues/14044#issuecomment-2623464462 Thanks @HectorPascual! I'm opening PR from here on for transparency's sake! -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-01-29 Thread via GitHub
rohitrastogi commented on code in PR #14286: URL: https://github.com/apache/datafusion/pull/14286#discussion_r1934929386 ## datafusion-examples/examples/thread_pools_lib/dedicated_executor.rs: ## @@ -0,0 +1,1778 @@ +// Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] Replace is_sorted helper with standard one. [datafusion]

2025-01-29 Thread via GitHub
github-actions[bot] commented on PR #13608: URL: https://github.com/apache/datafusion/pull/13608#issuecomment-2623347375 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or

Re: [I] Create a wrapper class to access org.apache.arrow.c.SchemaImporter [datafusion-comet]

2025-01-29 Thread via GitHub
parthchandra commented on issue #1352: URL: https://github.com/apache/datafusion-comet/issues/1352#issuecomment-2623341094 +1. CometSchemaImporter need never be exposed to Iceberg. -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] [substrait] Add support for ExtensionTable [datafusion]

2025-01-29 Thread via GitHub
vbarua commented on PR #13772: URL: https://github.com/apache/datafusion/pull/13772#issuecomment-2623341026 Apologies for the delay, I haven't had the bandwidth to follow up on this (and I still don't tbh). At this point I'm ambivalent about this capability, but I wouldn't vote again

[I] Create a wrapper class to access org.apache.arrow.c.SchemaImporter [datafusion-comet]

2025-01-29 Thread via GitHub
huaxingao opened a new issue, #1352: URL: https://github.com/apache/datafusion-comet/issues/1352 ### What is the problem the feature request solves? CometSchemaImporter is a Comet class but is in the org.apache.arrow.c package to overcome access restrictions (Arrow's SchemaImporter is

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623202562 Was that working in #14057? I didn't see a test for it. Hypothetically speaking we could do something in DFSchema to deduplicate but I worry that won't make it work e.g. we'll

Re: [I] multiply overflow in stats.rs [datafusion]

2025-01-29 Thread via GitHub
LindaSummer commented on issue #13775: URL: https://github.com/apache/datafusion/issues/13775#issuecomment-2623179970 Hi, Sorry for the delay on this issue. I will try to work on it now. 😊 Best Regards, Edward -- This is an automated message from the Apache Git Service.

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623174593 > Is this that important to support? The example seems a bit contrived, I think it'd be more reasonable if it occurred naturally as part of a join or something where a user could

Re: [PR] feat: add expression array_size [datafusion-comet]

2025-01-29 Thread via GitHub
parthchandra commented on code in PR #1122: URL: https://github.com/apache/datafusion-comet/pull/1122#discussion_r1934793909 ## native/spark-expr/src/list.rs: ## @@ -708,6 +708,92 @@ impl PartialEq for ArrayInsert { } } +#[derive(Debug, Hash)] +pub struct ArraySize { +

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623148014 I will still give a shot at adding that feature tomorrow. But I'm not sold on the behavior being ideal even if that's what Spark does. Besides if it's an error now we can always mak

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623096983 something like this. ```rust #[tokio::test] async fn test_name_conflict() { let batch = record_batch!( ("_rowid", UInt32, [0, 1, 2]), ("_ro

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623125556 Is this that important to support? The example seems a bit contrived, I think it'd be more reasonable if it occurred naturally as part of a join or something where a user could une

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934746178 ## native/core/src/execution/operators/filter.rs: ## @@ -62,6 +65,8 @@ pub struct FilterExec { default_selectivity: u8, /// Properties equivalence

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2622993100 @chenkovsky could you maybe translate that into a test that we can add? I'm having trouble imagining in what sorts of situations this would apply. Generally in SQL if you have two c

Re: [I] Improve Parallel Reading (CSV, JSON) / Help Wanted [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #8723: URL: https://github.com/apache/datafusion/issues/8723#issuecomment-2622766890 If anyone wants a fun exercise, getting the CSV reader to read in parallel from local files would greatly speed up the h2o benchmarks -- This is an automated message from the Apac

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2622977789 as described in document, https://spark.apache.org/docs/3.5.1/api/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.html If a table column and a metadata c

Re: [I] Type Coercion fails for List with inner type struct which has large/view types [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14154: URL: https://github.com/apache/datafusion/issues/14154#issuecomment-2622959604 So the next step for this PR is to find a DataFusion-only reproducer that works in DF 43 but not in DF 44. I will try to do so tomorrow -- This is an automated message from the

Re: [I] Type Coercion fails for List with inner type struct which has large/view types [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14154: URL: https://github.com/apache/datafusion/issues/14154#issuecomment-2622957118 Possibly related: - https://github.com/apache/datafusion/pull/12490 (released in DataFusion 44.0.0) - https://github.com/apache/datafusion/pull/13452 (also released in Data

Re: [PR] Script and documentation for regenerating sqlite test files [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on code in PR #14290: URL: https://github.com/apache/datafusion/pull/14290#discussion_r1934713735 ## datafusion/sqllogictest/regenerate_sqlite_files.sh: ## @@ -0,0 +1,179 @@ +#!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one +# or mo

Re: [PR] chore(deps): bump rustyline from 14.0.0 to 15.0.0 in /datafusion-cli [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14265: URL: https://github.com/apache/datafusion/pull/14265#issuecomment-2622947328 I am working to keep the dependencies updated and the PR queue lower -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

Re: [PR] chore(deps): bump rustyline from 14.0.0 to 15.0.0 in /datafusion-cli [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14265: URL: https://github.com/apache/datafusion/pull/14265 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] Support arrays_overlap function (alias of `array_has_any`) [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14217: URL: https://github.com/apache/datafusion/pull/14217 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] Improve deprecation message for MemoryExec [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14322: URL: https://github.com/apache/datafusion/pull/14322#issuecomment-2622945474 Thanks @xudong963 and @shehabgamin -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [I] Support arrays_overlap function [datafusion]

2025-01-29 Thread via GitHub
alamb closed issue #14216: Support arrays_overlap function URL: https://github.com/apache/datafusion/issues/14216 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] Minor: include the number of files run in sqllogictest display [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14359: URL: https://github.com/apache/datafusion/pull/14359#discussion_r1934703443 ## datafusion/sqllogictest/bin/sqllogictests.rs: ## @@ -184,7 +186,11 @@ async fn run_tests() -> Result<()> { .collect() .await; -m.println(f

Re: [I] Build time regression [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14256: URL: https://github.com/apache/datafusion/issues/14256#issuecomment-2622937724 > Let me see if I can find ways to make Expr smaller though I can make Expr less than half the size in this PR: - https://github.com/apache/datafusion/pull/14366 I

Re: [PR] Reduce size of `Expr` struct [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14366: URL: https://github.com/apache/datafusion/pull/14366#discussion_r1934699879 ## datafusion/expr/src/expr.rs: ## @@ -297,7 +298,7 @@ pub enum Expr { /// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt AggregateFunction(Aggregate

Re: [PR] Reduce size of `Expr` struct [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #14366: URL: https://github.com/apache/datafusion/pull/14366#discussion_r1934669162 ## datafusion/expr/src/expr.rs: ## @@ -3067,4 +3078,19 @@ mod test { rename: opt_rename, } } + +#[test] +fn test_size_of_expr()

Re: [I] Simple Functions [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on issue #12635: URL: https://github.com/apache/datafusion/issues/12635#issuecomment-2622912627 https://github.com/apache/datafusion/blob/main/datafusion/functions-nested/src/string.rs is a good example of what can result with supporting many types and args -- This is

Re: [PR] Reduce size of `Expr` struct [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #14366: URL: https://github.com/apache/datafusion/pull/14366#discussion_r1934670024 ## datafusion/expr/src/logical_plan/plan.rs: ## @@ -2420,19 +2420,24 @@ impl Window { .iter() .enumerate() .filter_map(|(idx

Re: [I] Simple Functions [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on issue #12635: URL: https://github.com/apache/datafusion/issues/12635#issuecomment-2622869956 I'll let @davidhewitt chime in but we've experienced a lot of generic bloat from having to implement functions that operate on scalars, arrays, dictionary arrays and take multip

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on PR #14364: URL: https://github.com/apache/datafusion/pull/14364#issuecomment-2622850087 > If you are feeling like some more refactoring projects, any chance you are interested in working to split out the data sources (aka make `datafusion-datasource-parquet`, `dataf

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934596188 ## docs/source/user-guide/configs.md: ## @@ -64,6 +64,7 @@ Comet provides the following configuration settings. | spark.comet.explain.native.enabled | When

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on code in PR #14364: URL: https://github.com/apache/datafusion/pull/14364#discussion_r1934571539 ## datafusion/substrait/Cargo.toml: ## @@ -46,6 +46,7 @@ url = { workspace = true } [dev-dependencies] datafusion = { workspace = true, features = ["nested

Re: [I] Build time regression [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14256: URL: https://github.com/apache/datafusion/issues/14256#issuecomment-2622826545 > > After removing the WildcardOptions (by replacing it with an empty structure) I can see the build time drops. Removing the rule itself and the change in core doesn't help. It l

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on code in PR #14364: URL: https://github.com/apache/datafusion/pull/14364#discussion_r1934568525 ## datafusion/catalog/src/lib.rs: ## @@ -18,23 +18,264 @@ //! Interfaces and default implementations of catalogs and schemas. //! //! Implementations +//! *

Re: [PR] refactor: switch `BooleanBufferBuilder` to `NullBufferBuilder` in single_group_by [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14360: URL: https://github.com/apache/datafusion/pull/14360 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] fix: LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14245: URL: https://github.com/apache/datafusion/pull/14245 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [I] LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
alamb closed issue #14204: LimitPushdown rule uncorrect remove some GlobalLimitExec URL: https://github.com/apache/datafusion/issues/14204 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] fix: LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14245: URL: https://github.com/apache/datafusion/pull/14245#issuecomment-2622814836 Thanks again @zhuqi-lucas and @xudong963 -- this PR took a while but I think things are good in the end -- This is an automated message from the Apache Git Service. To respond to t

Re: [PR] Deprecate the use of `datafusion_sql::ResolvedTableReference and TableReference` [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14365: URL: https://github.com/apache/datafusion/pull/14365#discussion_r1934560993 ## datafusion/sql/src/lib.rs: ## @@ -50,5 +50,9 @@ pub mod unparser; pub mod utils; mod values; +#[deprecated( +since = "45.0.0", +note = "use datafusion

[PR] Deprecate the use of `datafusion_sql::ResolvedTableReference and TableReference` [datafusion]

2025-01-29 Thread via GitHub
alamb opened a new pull request, #14365: URL: https://github.com/apache/datafusion/pull/14365 ## Which issue does this PR close? ## Rationale for this change Noticed while working on https://github.com/apache/datafusion/pull/14364 with @logan-keede `datafusion-sq

Re: [I] Build time regression [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14256: URL: https://github.com/apache/datafusion/issues/14256#issuecomment-2622793735 > After removing the WildcardOptions (by replacing it with an empty structure) I can see the build time drops. Removing the rule itself and the change in core doesn't help. It loo

Re: [I] Build time regression [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14256: URL: https://github.com/apache/datafusion/issues/14256#issuecomment-2622793969 Thanks to some great work from @buraksenn @berkaysynnada and @logan-keede we have completed extracting physical optimizer rules: - https://github.com/apache/datafusion/issues/1

Re: [I] Jan 18, 2025: This week(s) in DataFusion [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14179: URL: https://github.com/apache/datafusion/issues/14179#issuecomment-2622790016 Thanks to some great work from @buraksenn @berkaysynnada and @logan-keede we have completed extracting physical optimizer rules: - https://github.com/apache/datafusion/issues/1

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14364: URL: https://github.com/apache/datafusion/pull/14364#discussion_r1934520683 ## datafusion/substrait/Cargo.toml: ## @@ -46,6 +46,7 @@ url = { workspace = true } [dev-dependencies] datafusion = { workspace = true, features = ["nested_expre

Re: [I] [Epic] Extract catalog functionality from the core to make it more modular [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #10782: URL: https://github.com/apache/datafusion/issues/10782#issuecomment-2622779885 > Hi, I am working on moving `InformationSchema` into the `datafusion-catalog`. This would require moving `core/src/datasource/streaming.rs` (`StreamingTable`) to some place out of

Re: [I] Add `enable_url_table` as a argument to SessionStateBuilder [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #12394: URL: https://github.com/apache/datafusion/issues/12394#issuecomment-2622768363 Let's close this one for now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [I] Add `enable_url_table` as a argument to SessionStateBuilder [datafusion]

2025-01-29 Thread via GitHub
alamb closed issue #12394: Add `enable_url_table` as a argument to SessionStateBuilder URL: https://github.com/apache/datafusion/issues/12394 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14364: URL: https://github.com/apache/datafusion/pull/14364#issuecomment-2622752446 I am checking this one out -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Add RETURNS TABLE() support for CREATE FUNCTION in Postgresql [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
remysaissy commented on code in PR #1687: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1687#discussion_r1934516332 ## src/parser/mod.rs: ## @@ -4535,7 +4535,14 @@ impl<'a> Parser<'a> { self.expect_token(&Token::RParen)?; let return_type = if s

Re: [PR] Provide user-defined invariants for logical node extensions. [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14329: URL: https://github.com/apache/datafusion/pull/14329#discussion_r1934505630 ## datafusion/expr/src/logical_plan/extension.rs: ## @@ -54,6 +57,22 @@ pub trait UserDefinedLogicalNode: fmt::Debug + Send + Sync { /// Return the output schem

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r193447 ## datafusion/core/tests/physical_optimizer/enforce_sorting.rs: ## @@ -238,6 +241,338 @@ async fn test_remove_unnecessary_sort5() -> Result<()> { Ok(()) } +#

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934475131 ## native/core/src/execution/operators/scan.rs: ## @@ -304,11 +304,7 @@ fn scan_schema(input_batch: &InputBatch, data_types: &[DataType]) -> SchemaRef {

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934480800 ## datafusion/core/tests/physical_optimizer/enforce_sorting.rs: ## @@ -238,6 +241,338 @@ async fn test_remove_unnecessary_sort5() -> Result<()> { Ok(()) }

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934474112 ## native/spark-expr/src/conversion_funcs/cast.rs: ## @@ -988,6 +988,9 @@ fn is_datafusion_spark_compatible( return true; } match from_typ

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934471695 ## datafusion/core/tests/physical_optimizer/enforce_sorting.rs: ## @@ -238,6 +241,338 @@ async fn test_remove_unnecessary_sort5() -> Result<()> { Ok(()) } +#

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on PR #14364: URL: https://github.com/apache/datafusion/pull/14364#issuecomment-2622655265 cc @comphead @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934447430 ## docs/source/user-guide/configs.md: ## @@ -64,6 +64,7 @@ Comet provides the following configuration settings. | spark.comet.explain.native.enabled

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934452868 ## native/core/src/execution/operators/scan.rs: ## @@ -304,11 +304,7 @@ fn scan_schema(input_batch: &InputBatch, data_types: &[DataType]) -> SchemaRe

[PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede opened a new pull request, #14364: URL: https://github.com/apache/datafusion/pull/14364 ## Which issue does this PR close? Closes #10782 ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [PR] equivalence classes: use normalized mapping for projection [datafusion]

2025-01-29 Thread via GitHub
askalt commented on PR #14327: URL: https://github.com/apache/datafusion/pull/14327#issuecomment-2622614689 I added a test to .slt. It was slightly tricky because setting up a table with existing equivalence classes (e.g., a being an alias for b) is not very straightforward. I took advantag

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934442772 ## native/core/src/execution/operators/scan.rs: ## @@ -304,11 +304,7 @@ fn scan_schema(input_batch: &InputBatch, data_types: &[DataType]) -> SchemaRef {

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934436738 ## native/core/src/execution/operators/scan.rs: ## @@ -304,11 +304,7 @@ fn scan_schema(input_batch: &InputBatch, data_types: &[DataType]) -> SchemaRe

[I] Add `try_new` for `LogicalPlan::Join` `Join` [datafusion]

2025-01-29 Thread via GitHub
phisn opened a new issue, #14363: URL: https://github.com/apache/datafusion/issues/14363 ### Is your feature request related to a problem or challenge? Currently one has to manually add the schema when creating a join or give an empty one and call `recompute_schema`. It would be nice

Re: [I] Result mismatch with vanilla spark in hash function with decimal input [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove closed issue #1294: Result mismatch with vanilla spark in hash function with decimal input URL: https://github.com/apache/datafusion-comet/issues/1294 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

Re: [PR] fix: Fall back to Spark when hashing decimals with precision > 18 [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on PR #1325: URL: https://github.com/apache/datafusion-comet/pull/1325#issuecomment-2622599915 Thanks for the review @kazuyukitanimura and @parthchandra -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] fix: Fall back to Spark when hashing decimals with precision > 18 [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove merged PR #1325: URL: https://github.com/apache/datafusion-comet/pull/1325 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
samuelcolvin commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1934433832 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, f

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
gstvg commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1933397382 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, for exam

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
gstvg commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1934429898 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, for exam

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934403062 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -6203,3 +6203,97 @@ physical_plan 14)--PlaceholderRowExec 15)ProjectionExec:

Re: [PR] fix: pass scale to DF round in spark_round [datafusion-comet]

2025-01-29 Thread via GitHub
kazuyukitanimura commented on code in PR #1341: URL: https://github.com/apache/datafusion-comet/pull/1341#discussion_r1934414738 ## native/spark-expr/src/math_funcs/round.rs: ## @@ -135,3 +136,50 @@ fn decimal_round_f(scale: &i8, point: &i64) -> Box<dyn Fn(i128) -> i128> { Box::new(mov

Re: [PR] Document SQL dialect guidance [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #13706: URL: https://github.com/apache/datafusion/pull/13706#discussion_r1934410805 ## docs/source/user-guide/sql/dialect.md: ## @@ -0,0 +1,53 @@ + + +# SQL Dialect + +The included SQL supported in Apache DataFusion mostly follows the [PostgreSQL

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934382053 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -6203,3 +6203,97 @@ physical_plan 14)--PlaceholderRowExec 15)ProjectionExec: exp

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2622510720 @chenkovsky @jayzhan211 could you please review? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abo

Re: [PR] feat: metadata columns [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2622481316 Here's what I think is a much simpler and more flexible change: https://github.com/apache/datafusion/pull/14362 -- This is an automated message from the Apache Git Service. To res

Re: [PR] expose write options [datafusion-python]

2025-01-29 Thread via GitHub
kylebarron commented on code in PR #1006: URL: https://github.com/apache/datafusion-python/pull/1006#discussion_r1934368047 ## src/options.rs: ## @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on code in PR #14362: URL: https://github.com/apache/datafusion/pull/14362#discussion_r1934364024 ## datafusion/expr/src/utils.rs: ## @@ -736,11 +802,18 @@ pub fn exprlist_to_fields<'a>( .into_iter() .map(|c| c.flat_na

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934352834 ## datafusion/expr/src/udaf.rs: ## @@ -818,6 +826,26 @@ pub mod aggregate_doc_sections { }; } +/// Status of an Aggregate Expression's Set Monotonicity +

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934352035 ## common/src/main/scala/org/apache/comet/CometConf.scala: ## @@ -605,6 +605,15 @@ object CometConf extends ShimCometConf { .booleanConf .creat

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934351236 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -929,6 +929,19 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde w

Re: [PR] Feature: AggregateMonotonicity [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1934350058 ## datafusion/sqllogictest/test_files/aggregate.slt: ## @@ -6203,3 +6203,97 @@ physical_plan 14)--PlaceholderRowExec 15)ProjectionExec:

Re: [PR] Support marking columns as system columns via metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2622449713 I don't know if there's any other "known" metadata, but I feel like it would be good to have an extension trait along the lines of: ```rust /// Extension of [`Field`] to ma

[PR] Support marking columns as system columns via metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb opened a new pull request, #14362: URL: https://github.com/apache/datafusion/pull/14362 Closes #14057 Closes #13975 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen
