Re: [I] Release DataFusion `47.0.0` (April 2025) [datafusion]

2025-03-18 Thread via GitHub
shehabgamin commented on issue #15072: URL: https://github.com/apache/datafusion/issues/15072#issuecomment-2735261314 I feel like this may be important enough to try to get into the release. Does anyone else have thoughts? https://github.com/apache/datafusion/issues/15174

Re: [PR] Add dynamic pruning filters from TopK state [datafusion]

2025-03-18 Thread via GitHub
adriangb commented on PR #15301: URL: https://github.com/apache/datafusion/pull/15301#issuecomment-2735401711 Inspired by discussion in https://github.com/apache/datafusion/pull/13054 I went with adding this to `ExecutionPlan`. -- This is an automated message from the Apache Git Service.

Re: [PR] Add dynamic pruning filters from TopK state [datafusion]

2025-03-18 Thread via GitHub
adriangb commented on PR #15301: URL: https://github.com/apache/datafusion/pull/15301#issuecomment-2735408031 Tomorrow I plan on doing some tracer bullet testing to see if this approach works at all.

Re: [PR] Add dynamic pruning filters from TopK state [datafusion]

2025-03-18 Thread via GitHub
adriangb commented on PR #15301: URL: https://github.com/apache/datafusion/pull/15301#issuecomment-2735413403 cc @alamb

Re: [PR] fix: `core_expressions` feature flag broken, move `overlay` into `core` functions [datafusion]

2025-03-18 Thread via GitHub
alamb commented on PR #15217: URL: https://github.com/apache/datafusion/pull/15217#issuecomment-2734530255 > hey @alamb, I have already added a re-export at the end of `datafusion/functions/src/string/overlay.rs` like this Thanks @shruti2522 - that looks good to me I double ch

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-18 Thread via GitHub
comphead commented on code in PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456#discussion_r2001739415 ## native/core/Cargo.toml: ## @@ -77,6 +77,7 @@ jni = { version = "0.21", features = ["invocation"] } lazy_static = "1.4" assertables = "7" hex = "0.4.3" +

Re: [PR] Add WITH ORDER example to blog post [datafusion-site]

2025-03-18 Thread via GitHub
akurmustafa commented on PR #59: URL: https://github.com/apache/datafusion-site/pull/59#issuecomment-2734756045 > > Thanks @alamb, I was working on adding the example you gave ("DataFusion can find / use orderings based on query intermediates"). Should we add this to the document what do yo

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-18 Thread via GitHub
alamb merged PR #15209: URL: https://github.com/apache/datafusion/pull/15209 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusi

Re: [PR] chore(deps): bump uuid from 1.15.1 to 1.16.0 [datafusion]

2025-03-18 Thread via GitHub
xudong963 merged PR #15292: URL: https://github.com/apache/datafusion/pull/15292

Re: [PR] feat: Add `datafusion-spark` crate [datafusion]

2025-03-18 Thread via GitHub
shehabgamin commented on code in PR #15168: URL: https://github.com/apache/datafusion/pull/15168#discussion_r2002067067 ## datafusion/spark/src/function/math/expm1.rs: ## @@ -0,0 +1,169 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic

Re: [PR] Blog post on Parquet pruning in datafusion [datafusion-site]

2025-03-18 Thread via GitHub
comphead commented on code in PR #60: URL: https://github.com/apache/datafusion-site/pull/60#discussion_r2002069929 ## content/blog/2025-03-18-parquet-pruning.md: ## @@ -0,0 +1,111 @@ +--- +layout: post +title: Parquet pruning in DataFusion: Read Only What Matters +date: 2025-03

Re: [PR] Improvement/improve wildcard error 15004 [datafusion]

2025-03-18 Thread via GitHub
Jiashu-Hu commented on code in PR #15287: URL: https://github.com/apache/datafusion/pull/15287#discussion_r2001662346 ## datafusion/sql/src/select.rs: ## @@ -826,6 +827,13 @@ impl SqlToRel<'_, S> { .map(|expr| rebase_expr(expr, &aggr_projection_exprs, input))

[PR] Parse `SUBSTR` as alias for `SUBSTRING` [datafusion-sqlparser-rs]

2025-03-18 Thread via GitHub
mvzink opened a new pull request, #1769: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1769 (no comment)

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001376856 ## datafusion/core/tests/parquet/custom_reader.rs: ## @@ -96,17 +97,15 @@ async fn route_data_access_ops_to_parquet_file_reader_factory() { let task_ctx = ses

Re: [PR] Add CatalogProvider and SchemaProvider to FFI Crate [datafusion]

2025-03-18 Thread via GitHub
timsaucer merged PR #15280: URL: https://github.com/apache/datafusion/pull/15280

Re: [PR] Add WITH ORDER example to blog post [datafusion-site]

2025-03-18 Thread via GitHub
akurmustafa commented on PR #59: URL: https://github.com/apache/datafusion-site/pull/59#issuecomment-2734859868 With the [commit](https://github.com/apache/datafusion-site/pull/59/commits/85eea6a572f95972a155ee9926319112e7149ce8), I have added @alamb's suggestion to the post.

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-18 Thread via GitHub
andygrove commented on code in PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456#discussion_r2002089283 ## native/core/src/execution/planner.rs: ## @@ -3004,4 +3006,130 @@ mod tests { type_info: None, } } + +#[test] +fn test_c

Re: [PR] Add WITH ORDER example to blog post [datafusion-site]

2025-03-18 Thread via GitHub
Omega359 commented on code in PR #59: URL: https://github.com/apache/datafusion-site/pull/59#discussion_r2002099018 ## content/images/ordering_analysis/query_window_plan.png: ## Review Comment: At the output of the window function the table has the ordering:

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-18 Thread via GitHub
comphead commented on PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456#issuecomment-2734862629 Thanks everyone

Re: [PR] feat: add read array support [datafusion-comet]

2025-03-18 Thread via GitHub
comphead merged PR #1456: URL: https://github.com/apache/datafusion-comet/pull/1456

Re: [PR] docs: Use a shallow clone for Spark SQL test instructions [datafusion-comet]

2025-03-18 Thread via GitHub
andygrove merged PR #1547: URL: https://github.com/apache/datafusion-comet/pull/1547

Re: [I] Blog for DataFusion 46.0.0 [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15053: URL: https://github.com/apache/datafusion/issues/15053#issuecomment-2734564305 Thanks @berkaysynnada In general I suggest emphasizing things that many users of the crate will see / appreciate and mentioning, but not too deeply, things that developers

Re: [PR] feat: support merge for `Distribution` [datafusion]

2025-03-18 Thread via GitHub
xudong963 commented on code in PR #15296: URL: https://github.com/apache/datafusion/pull/15296#discussion_r2002299377 ## datafusion/expr-common/src/statistics.rs: ## @@ -857,6 +857,143 @@ pub fn compute_variance( ScalarValue::try_from(target_type) } +/// Merges two distr

Re: [PR] feat: support merge for `Distribution` [datafusion]

2025-03-18 Thread via GitHub
xudong963 commented on code in PR #15296: URL: https://github.com/apache/datafusion/pull/15296#discussion_r2002309255 ## datafusion/expr-common/src/statistics.rs: ## @@ -857,6 +857,143 @@ pub fn compute_variance( ScalarValue::try_from(target_type) } +/// Merges two distr

Re: [I] Support `merge` for `Distribution` [datafusion]

2025-03-18 Thread via GitHub
xudong963 commented on issue #15290: URL: https://github.com/apache/datafusion/issues/15290#issuecomment-2732391652 There is a proposal: https://github.com/apache/datafusion/pull/15296

Re: [PR] Improve performance of `first_value` by implementing special `GroupsAccumulator` [datafusion]

2025-03-18 Thread via GitHub
UBarney commented on code in PR #15266: URL: https://github.com/apache/datafusion/pull/15266#discussion_r2000992780 ## datafusion/functions-aggregate/src/first_last.rs: ## @@ -179,6 +292,423 @@ impl AggregateUDFImpl for FirstValue { } } +struct FirstGroupsAccumulator +wh

Re: [I] Failed optimizations with Int64 type [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15291: URL: https://github.com/apache/datafusion/issues/15291#issuecomment-2732103208 Thanks @aectaan -- what is the error message that you get?

Re: [PR] fix: Unconditionally wrap UNION BY NAME input nodes w/ `Projection` [datafusion]

2025-03-18 Thread via GitHub
Omega359 commented on code in PR #15242: URL: https://github.com/apache/datafusion/pull/15242#discussion_r2001009802 ## datafusion/sqllogictest/test_files/union_by_name.slt: ## @@ -287,3 +287,137 @@ SELECT '0' as c UNION ALL BY NAME SELECT 0 as c; 0 0 + +# Regression tes

Re: [PR] Migrate datasource tests to insta [datafusion]

2025-03-18 Thread via GitHub
blaginin commented on code in PR #15258: URL: https://github.com/apache/datafusion/pull/15258#discussion_r2001416176 ## datafusion/core/Cargo.toml: ## @@ -126,6 +126,7 @@ datafusion-physical-plan = { workspace = true } datafusion-sql = { workspace = true } flate2 = { version =

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001379803 ## datafusion/core/tests/parquet/schema.rs: ## @@ -153,7 +151,15 @@ async fn schema_merge_can_preserve_metadata() { let actual = df.collect().await.unwrap();

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001384434 ## datafusion/core/tests/sql/path_partition.rs: ## @@ -145,16 +146,7 @@ async fn parquet_distinct_partition_col() -> Result<()> { .collect() .awai

Re: [PR] minor: fix `data/sqlite` link [datafusion]

2025-03-18 Thread via GitHub
sdht0 commented on PR #15286: URL: https://github.com/apache/datafusion/pull/15286#issuecomment-2733877823 Ah, fixed it, thanks.

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-18 Thread via GitHub
xudong963 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r2001108185 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -202,53 +202,48 @@ physical_plan 02)│ AggregateExec │ 03)│

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-18 Thread via GitHub
xudong963 commented on PR #15268: URL: https://github.com/apache/datafusion/pull/15268#issuecomment-2733344214 Thank you all!

Re: [PR] Refactor file schema type coercions [datafusion]

2025-03-18 Thread via GitHub
xudong963 merged PR #15268: URL: https://github.com/apache/datafusion/pull/15268

Re: [I] Support `merge` for `Distribution` [datafusion]

2025-03-18 Thread via GitHub
xudong963 commented on issue #15290: URL: https://github.com/apache/datafusion/issues/15290#issuecomment-2732132955 > Will require an accurate distribution (not just an approximation Yes, it depends on whether each distribution is accurate; if they are, the merged distribution should b

Re: [PR] Improve performance of `first_value` by implementing special `GroupsAccumulator` [datafusion]

2025-03-18 Thread via GitHub
Dandandan commented on code in PR #15266: URL: https://github.com/apache/datafusion/pull/15266#discussion_r2000574008 ## datafusion/functions-aggregate/src/first_last.rs: ## @@ -179,6 +292,423 @@ impl AggregateUDFImpl for FirstValue { } } +struct FirstGroupsAccumulator

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-18 Thread via GitHub
xudong963 commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r2000832270 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -202,53 +202,48 @@ physical_plan 02)│ AggregateExec │ 03)│

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-03-18 Thread via GitHub
alamb commented on PR #14286: URL: https://github.com/apache/datafusion/pull/14286#issuecomment-2733529188 Converting to a draft until we have spawn service - https://github.com/apache/arrow-rs/pull/7253

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-18 Thread via GitHub
jayzhan211 commented on code in PR #15201: URL: https://github.com/apache/datafusion/pull/15201#discussion_r2000904176 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1571,14 +1571,18 @@ async fn with_column_join_same_columns() -> Result<()> { assert_snapshot!(

Re: [PR] Triggering extended tests through PR comment [datafusion]

2025-03-18 Thread via GitHub
Omega359 commented on PR #15101: URL: https://github.com/apache/datafusion/pull/15101#issuecomment-2733251482 Is this ready for review or is there something outstanding for it to be still in draft?

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-18 Thread via GitHub
irenjj commented on PR #15253: URL: https://github.com/apache/datafusion/pull/15253#issuecomment-2733266739 Thanks @alamb, @jayzhan211 and @xudong963 for your review, here are two points that remain unclear: 1. For GROUP BY, is it necessary to preserve the row index -- for more informati

Re: [PR] minor: fix `data/sqlite` link [datafusion]

2025-03-18 Thread via GitHub
sdht0 commented on PR #15286: URL: https://github.com/apache/datafusion/pull/15286#issuecomment-2733900724 Weird that I received an email about another comment from Weijun-H but can't see it here?

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-18 Thread via GitHub
irenjj commented on PR #15253: URL: https://github.com/apache/datafusion/pull/15253#issuecomment-2732945218 > Is it possible to modify `Display` for Expr for explain statement? I haven't tried it, not sure how it will affect other logic.

Re: [PR] feat: Fix multi-lines printing issue for datafusion-cli and add the streaming printing feature back [datafusion]

2025-03-18 Thread via GitHub
blaginin commented on code in PR #14954: URL: https://github.com/apache/datafusion/pull/14954#discussion_r2001448048 ## datafusion-cli/tests/cli_integration.rs: ## @@ -51,6 +51,163 @@ fn init() { ["--command", "show datafusion.execution.batch_size", "--format", "json", "-q

Re: [I] Require Comet 0.6 Docker image for Spark 3.5.5 [datafusion-comet]

2025-03-18 Thread via GitHub
RaghavendraGanesh commented on issue #1509: URL: https://github.com/apache/datafusion-comet/issues/1509#issuecomment-2733053103 Thanks @andygrove, will give it a try.

Re: [D] enum Expr extension on logical level [datafusion]

2025-03-18 Thread via GitHub
GitHub user bertvermeiren edited a discussion: enum Expr extension on logical level Hi, In order to write some additional logical and physical plan implementations, we do have to create some kind of "composite" logical expression. Nowadays in the code base you have already existing expressi

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001378037 ## datafusion/core/tests/parquet/schema.rs: ## @@ -82,7 +69,18 @@ async fn schema_merge_ignores_metadata_by_default() { .unwrap(); let actual = df.col

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001380462 ## datafusion/core/tests/parquet/schema.rs: ## @@ -167,7 +173,15 @@ async fn schema_merge_can_preserve_metadata() { assert_eq!(actual.clone(), expected_metadat

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001378793 ## datafusion/core/tests/parquet/schema.rs: ## @@ -97,7 +95,18 @@ async fn schema_merge_ignores_metadata_by_default() { .collect() .await

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001386538 ## datafusion/core/tests/sql/path_partition.rs: ## @@ -430,21 +381,7 @@ async fn parquet_multiple_partitions() -> Result<()> { .collect() .await?;

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001387282 ## datafusion/core/tests/sql/select.rs: ## @@ -30,23 +30,7 @@ async fn test_list_query_parameters() -> Result<()> { .with_param_values(vec![ScalarValue::fr

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001385323 ## datafusion/core/tests/sql/path_partition.rs: ## @@ -313,18 +294,7 @@ async fn csv_filter_with_file_nonstring_col() -> Result<()> { .collect()

Re: [PR] Migrate tests to insta [datafusion]

2025-03-18 Thread via GitHub
jsai28 commented on code in PR #15288: URL: https://github.com/apache/datafusion/pull/15288#discussion_r2001388172 ## datafusion/core/tests/sql/select.rs: ## @@ -114,33 +72,7 @@ async fn test_prepare_statement() -> Result<()> { let dataframe = dataframe.with_param_values(pa

Re: [PR] Migrate user_defined tests to insta [datafusion]

2025-03-18 Thread via GitHub
blaginin commented on code in PR #15255: URL: https://github.com/apache/datafusion/pull/15255#discussion_r2000955601 ## datafusion/core/tests/user_defined/expr_planner.rs: ## @@ -73,52 +73,62 @@ async fn plan_and_collect(sql: &str) -> Result> { ctx.sql(sql).await?.collect(

Re: [PR] Add CatalogProvider and SchemaProvider to FFI Crate [datafusion]

2025-03-18 Thread via GitHub
alamb commented on PR #15280: URL: https://github.com/apache/datafusion/pull/15280#issuecomment-2732820890 🎉

Re: [PR] Migrate user_defined tests to insta [datafusion]

2025-03-18 Thread via GitHub
blaginin commented on code in PR #15255: URL: https://github.com/apache/datafusion/pull/15255#discussion_r2000968071 ## datafusion/core/tests/user_defined/user_defined_table_functions.rs: ## @@ -34,11 +34,19 @@ use datafusion::physical_plan::{collect, ExecutionPlan}; use datafu

Re: [PR] Migrate user_defined tests to insta [datafusion]

2025-03-18 Thread via GitHub
blaginin commented on code in PR #15255: URL: https://github.com/apache/datafusion/pull/15255#discussion_r2000967075 ## datafusion/core/tests/user_defined/user_defined_window_functions.rs: ## @@ -57,30 +57,38 @@ const BOUNDED_WINDOW_QUERY: &str = odd_counter(val) OVER (P

[I] Update all github workflow to use actions tied to sha hashes [datafusion]

2025-03-18 Thread via GitHub
Omega359 opened a new issue, #15298: URL: https://github.com/apache/datafusion/issues/15298 ### Is your feature request related to a problem or challenge? A recent [supply chain attack](https://arstechnica.com/information-technology/2025/03/supply-chain-attack-exposing-credentials-aff

Re: [PR] Add GLOBAL context/modifier to SET statements [datafusion-sqlparser-rs]

2025-03-18 Thread via GitHub
MohamedAbdeen21 commented on code in PR #1767: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1767#discussion_r2000753514 ## src/ast/mod.rs: ## @@ -7919,11 +7921,28 @@ impl fmt::Display for ContextModifier { write!(f, "") }

Re: [I] Update all github workflow to use actions tied to sha hashes [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15298: URL: https://github.com/apache/datafusion/issues/15298#issuecomment-2734055350 Thank you @Omega359 -- I agree this is very important

Re: [PR] Fix predicate pushdown for custom SchemaAdapters [datafusion]

2025-03-18 Thread via GitHub
alamb commented on code in PR #15263: URL: https://github.com/apache/datafusion/pull/15263#discussion_r200126 ## datafusion/core/src/datasource/physical_plan/parquet.rs: ## @@ -224,6 +224,327 @@ mod tests { ) } +#[tokio::test] +async fn test_pushdown_

Re: [I] Add support for S3 Object Store in default binaries [datafusion-ballista]

2025-03-18 Thread via GitHub
milenkovicm commented on issue #1205: URL: https://github.com/apache/datafusion-ballista/issues/1205#issuecomment-2734056479 Once we have S3 support it should work with MinIO. Would you be interested in contributing, @fithisux?

Re: [I] Update all github workflow to use actions tied to sha hashes [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15298: URL: https://github.com/apache/datafusion/issues/15298#issuecomment-2734058015 I think this is a good first issue as the write up is clear and there is an example to follow

Re: [I] Failed optimizations with Int64 type [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15291: URL: https://github.com/apache/datafusion/issues/15291#issuecomment-2734059396 Thanks @aectaan

Re: [I] Failed optimizations with Int64 type [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15291: URL: https://github.com/apache/datafusion/issues/15291#issuecomment-2734063672 I wonder if you can get a pure SQL (`datafusion-cli` based) reproducer? Or does it require creating and configuring a custom context / optimizer rules 🤔

[PR] feat: support merge for `Distribution` [datafusion]

2025-03-18 Thread via GitHub
xudong963 opened a new pull request, #15296: URL: https://github.com/apache/datafusion/pull/15296 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/15290 ## Rationale for this change See issue #15290 ## What change

Re: [PR] Improve performance of `first_value` by implementing special `GroupsAccumulator` [datafusion]

2025-03-18 Thread via GitHub
2010YOUY01 commented on code in PR #15266: URL: https://github.com/apache/datafusion/pull/15266#discussion_r2000593852 ## datafusion/functions-aggregate/src/first_last.rs: ## @@ -179,6 +292,423 @@ impl AggregateUDFImpl for FirstValue { } } +struct FirstGroupsAccumulator

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-18 Thread via GitHub
irenjj commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r2000877388 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -202,53 +202,48 @@ physical_plan 02)│ AggregateExec │ 03)│ │

Re: [PR] chore(deps): bump async-trait from 0.1.87 to 0.1.88 [datafusion]

2025-03-18 Thread via GitHub
xudong963 merged PR #15294: URL: https://github.com/apache/datafusion/pull/15294

Re: [D] Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? [datafusion]

2025-03-18 Thread via GitHub
GitHub user dadepo added a comment to the discussion: Does DataFusion Support JSON Path Filtering Like `jsonb_path_exists` in PostgreSQL? > You could create a function called `jsonb_path_exists` that takes a binary > column and a json path string perhaps? I think what I am missing is how this

Re: [I] [EPIC] A collection of tickets for improved WASM support in DataFusion [datafusion]

2025-03-18 Thread via GitHub
savaliyabhargav commented on issue #13815: URL: https://github.com/apache/datafusion/issues/13815#issuecomment-2732178094 @matthewmturner Yes, sure, I am interested. Can you please give me more detail about it?

[I] Support `merge` for `Distribution` [datafusion]

2025-03-18 Thread via GitHub
xudong963 opened a new issue, #15290: URL: https://github.com/apache/datafusion/issues/15290 ### Is your feature request related to a problem or challenge? I'm working on the ticket: https://github.com/apache/datafusion/issues/10316. Given that, we'll replace all `Precision` wit

Re: [PR] chore: Update links for released version [datafusion-comet]

2025-03-18 Thread via GitHub
andygrove commented on code in PR #1540: URL: https://github.com/apache/datafusion-comet/pull/1540#discussion_r2001302504 ## docs/source/user-guide/kubernetes.md: ## @@ -65,31 +65,31 @@ metadata: spec: type: Scala mode: cluster - image: ghcr.io/apache/datafusion-comet:sp

Re: [I] Add support for S3 Object Store in default binaries [datafusion-ballista]

2025-03-18 Thread via GitHub
fithisux commented on issue #1205: URL: https://github.com/apache/datafusion-ballista/issues/1205#issuecomment-2733646600 I would rather have MinIO support because they provide fully working Docker images that you can incorporate into a full-fledged Docker Compose educational stack.

Re: [I] Failed optimizations with Int64 type [datafusion]

2025-03-18 Thread via GitHub
aectaan commented on issue #15291: URL: https://github.com/apache/datafusion/issues/15291#issuecomment-2732112476 @alamb it's related to common subexpr eliminate: ``` called `Result::unwrap()` on an `Err` value: Context("Optimizer rule 'common_sub_expression_eliminate' failed", SchemaE

[PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-18 Thread via GitHub
kosiew opened a new pull request, #15295: URL: https://github.com/apache/datafusion/pull/15295 ## Which issue does this PR close? - Closes #14757. ## Rationale for this change This PR introduces a `NestedStructSchemaAdapter` to improve schema evolution handling in DataFu

Re: [PR] Enhance Schema adapter to accommodate evolving struct [datafusion]

2025-03-18 Thread via GitHub
kosiew commented on code in PR #15295: URL: https://github.com/apache/datafusion/pull/15295#discussion_r2000486806 ## datafusion/datasource/src/nested_schema_adapter.rs: ## @@ -0,0 +1,582 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-18 Thread via GitHub
irenjj commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r2001153997 ## datafusion/physical-plan/src/aggregates/mod.rs: ## @@ -801,6 +803,16 @@ impl DisplayAs for AggregateExec { } } Display

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-18 Thread via GitHub
onlyjackfrost commented on code in PR #15209: URL: https://github.com/apache/datafusion/pull/15209#discussion_r2001158621 ## datafusion/sql/src/expr/unary_op.rs: ## @@ -45,7 +45,18 @@ impl SqlToRel<'_, S> { { Ok(operand) } e

Re: [PR] Add support for `RAISE` statement [datafusion-sqlparser-rs]

2025-03-18 Thread via GitHub
iffyio commented on code in PR #1766: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1766#discussion_r2001170116 ## src/ast/mod.rs: ## @@ -2256,6 +2256,57 @@ impl fmt::Display for ConditionalStatements { } } +/// A `RAISE` statement. +/// +/// Examples: +//

Re: [PR] Simplify display format of `AggregateFunctionExpr`, add `Expr::sql_name` [datafusion]

2025-03-18 Thread via GitHub
irenjj commented on code in PR #15253: URL: https://github.com/apache/datafusion/pull/15253#discussion_r2001176277 ## datafusion/expr/src/expr.rs: ## @@ -2596,6 +2612,176 @@ impl Display for SchemaDisplay<'_> { } } +struct SqlDisplay<'a>(&'a Expr); +impl Display for SqlD

Re: [PR] Add support for `RAISE` statement [datafusion-sqlparser-rs]

2025-03-18 Thread via GitHub
iffyio merged PR #1766: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1766 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-18 Thread via GitHub
eliaperantoni commented on code in PR #15209: URL: https://github.com/apache/datafusion/pull/15209#discussion_r2001166961 ## datafusion/sql/src/expr/unary_op.rs: ## @@ -45,7 +45,18 @@ impl SqlToRel<'_, S> { { Ok(operand) } e

Re: [PR] Improve performance of `first_value` by implementing special `GroupsAccumulator` [datafusion]

2025-03-18 Thread via GitHub
blaginin commented on code in PR #15266: URL: https://github.com/apache/datafusion/pull/15266#discussion_r2001206529 ## datafusion/functions-aggregate/src/first_last.rs: ## @@ -179,6 +292,424 @@ impl AggregateUDFImpl for FirstValue { } } +struct FirstPrimitiveGroupsAccum

Re: [PR] Improve performance of `first_value` by implementing special `GroupsAccumulator` [datafusion]

2025-03-18 Thread via GitHub
blaginin commented on code in PR #15266: URL: https://github.com/apache/datafusion/pull/15266#discussion_r2001210201 ## datafusion/functions-aggregate/src/first_last.rs: ## @@ -179,6 +292,424 @@ impl AggregateUDFImpl for FirstValue { } } +struct FirstPrimitiveGroupsAccum

Re: [I] Timeouts reading "large" files from object stores over "slow" connections [datafusion]

2025-03-18 Thread via GitHub
alamb commented on issue #15067: URL: https://github.com/apache/datafusion/issues/15067#issuecomment-2733546979 I am convinced this issue would be solved with automatic retries - https://github.com/apache/arrow-rs/issues/7242

Re: [PR] Add GLOBAL context/modifier to SET statements [datafusion-sqlparser-rs]

2025-03-18 Thread via GitHub
iffyio commented on code in PR #1767: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1767#discussion_r2000290321 ## src/ast/mod.rs: ## @@ -7919,11 +7921,28 @@ impl fmt::Display for ContextModifier { write!(f, "") } Self::L

[PR] chore(deps): bump uuid from 1.15.1 to 1.16.0 [datafusion]

2025-03-18 Thread via GitHub
dependabot[bot] opened a new pull request, #15292: URL: https://github.com/apache/datafusion/pull/15292 Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.15.1 to 1.16.0. Release notes Sourced from https://github.com/uuid-rs/uuid/releases: uuid's releases. v1.16.0 What'

[PR] chore(deps): bump rust_decimal from 1.36.0 to 1.37.0 [datafusion]

2025-03-18 Thread via GitHub
dependabot[bot] opened a new pull request, #15293: URL: https://github.com/apache/datafusion/pull/15293 Bumps [rust_decimal](https://github.com/paupino/rust-decimal) from 1.36.0 to 1.37.0. Release notes Sourced from https://github.com/paupino/rust-decimal/releases: rust_decimal's

[PR] chore(deps): bump async-trait from 0.1.87 to 0.1.88 [datafusion]

2025-03-18 Thread via GitHub
dependabot[bot] opened a new pull request, #15294: URL: https://github.com/apache/datafusion/pull/15294 Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.87 to 0.1.88. Release notes Sourced from https://github.com/dtolnay/async-trait/releases: async-trait's rel

Re: [PR] fix: Queries similar to `count-bug` produce incorrect results [datafusion]

2025-03-18 Thread via GitHub
suibianwanwank commented on PR #15281: URL: https://github.com/apache/datafusion/pull/15281#issuecomment-2731893665 @alamb Hi, I have updated the PR title to start with "fix", but it seems that the "bug" label has not been added. The PR is now ready for review. Please take a look at you

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-18 Thread via GitHub
berkaysynnada merged PR #15284: URL: https://github.com/apache/datafusion/pull/15284

Re: [PR] Minor: consistently apply `clippy::clone_on_ref_ptr` in all crates [datafusion]

2025-03-18 Thread via GitHub
berkaysynnada commented on PR #15284: URL: https://github.com/apache/datafusion/pull/15284#issuecomment-2731952749 Thank you @alamb and @comphead

Re: [I] Upgrade Guide for DataFusion 46 does not include the array signatures change [datafusion]

2025-03-18 Thread via GitHub
alamb closed issue #15105: Upgrade Guide for DataFusion 46 does not include the array signatures change URL: https://github.com/apache/datafusion/issues/15105
