Re: [I] Add missing scalar functions [datafusion-python]

2025-03-14 Thread via GitHub
deanm commented on issue #463: URL: https://github.com/apache/datafusion-python/issues/463#issuecomment-2725355415 I see cardinality, empty, list_cat, list_concat, list_repeat, make_list, extract, and arrow_cast in functions.py. Only thing missing is trim_array if whoever has control o

Re: [I] Spike: evaluate if cuDF can be used with datafusion-python [datafusion-python]

2025-03-14 Thread via GitHub
deanm commented on issue #936: URL: https://github.com/apache/datafusion-python/issues/936#issuecomment-2725385324 polars has a gpu engine mode. I haven't looked at how they implement it but it's probably worth checking out. -- This is an automated message from the Apache Git Service

Re: [I] Remove all Cargo sub-dependencies? [datafusion-comet]

2025-03-14 Thread via GitHub
andygrove closed issue #1513: Remove all Cargo sub-dependencies? URL: https://github.com/apache/datafusion-comet/issues/1513 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [I] March 4, 2025: This week(s) in DataFusion [datafusion]

2025-03-14 Thread via GitHub
alamb commented on issue #15005: URL: https://github.com/apache/datafusion/issues/15005#issuecomment-2725428604 @XiangpengHao made a new post on more optimal predicate pushdown (we'll get it on in DataFusion finally!) - https://blog.xiangpeng.systems/posts/parquet-pushdown/

Re: [PR] fix: remove code duplication in native_datafusion and native_iceberg_compat implementations [datafusion-comet]

2025-03-14 Thread via GitHub
andygrove commented on code in PR #1443: URL: https://github.com/apache/datafusion-comet/pull/1443#discussion_r1996049247 ## native/core/src/execution/planner.rs: ## @@ -181,6 +179,61 @@ impl PhysicalPlanner { } } +/// get DataFusion PartitionedFiles from a S

Re: [PR] Implement `tree` explain for `ArrowFileSink` [datafusion]

2025-03-14 Thread via GitHub
alamb commented on code in PR #15206: URL: https://github.com/apache/datafusion/pull/15206#discussion_r1996139918 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -1716,14 +1716,21 @@ TO 'test_files/scratch/explain_tree/1.json'; physical_plan 01)┌─

Re: [PR] Implement tree explain for `LocalLimitExec` [datafusion]

2025-03-14 Thread via GitHub
alamb commented on code in PR #15232: URL: https://github.com/apache/datafusion/pull/15232#discussion_r1996135668 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -246,6 +251,69 @@ physical_plan 11)│format: csv│ 12)└───┘ +

Re: [PR] Implement tree explain for CoalescePartitionsExec [datafusion]

2025-03-14 Thread via GitHub
alamb commented on PR #15225: URL: https://github.com/apache/datafusion/pull/15225#issuecomment-2725653649 I took the liberty of applying @Weijun-H's suggestion and updating the expected output in this PR

Re: [I] Implement tree explain for `ArrowFileSink` [datafusion]

2025-03-14 Thread via GitHub
alamb closed issue #15112: Implement tree explain for `ArrowFileSink` URL: https://github.com/apache/datafusion/issues/15112

Re: [PR] Reject `RESPECT NULLS` and `IGNORE NULLS` for aggregate functions [datafusion]

2025-03-14 Thread via GitHub
alamb commented on PR #15014: URL: https://github.com/apache/datafusion/pull/15014#issuecomment-2725690181 Converting to draft as I think the consensus is that the query is working as designed

Re: [PR] Use insta for `DataFrame` tests [datafusion]

2025-03-14 Thread via GitHub
alamb commented on PR #15165: URL: https://github.com/apache/datafusion/pull/15165#issuecomment-2725675899 Thanks again @blaginin Do you plan to organize some more migration as part of - https://github.com/apache/datafusion/issues/15178 It will likely be a good exercise to

Re: [PR] Simpler to see expressions in tree explain mode [datafusion]

2025-03-14 Thread via GitHub
alamb commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996155083 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -704,29 +704,26 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │ 0

Re: [PR] fix: nested window function [datafusion]

2025-03-14 Thread via GitHub
alamb commented on code in PR #15033: URL: https://github.com/apache/datafusion/pull/15033#discussion_r1996239262 ## datafusion/core/tests/sql/sql_api.rs: ## @@ -19,6 +19,23 @@ use datafusion::prelude::*; use tempfile::TempDir; +#[tokio::test] +async fn test_window_function

Re: [I] Improve RepartitionExec for better query performance [datafusion]

2025-03-14 Thread via GitHub
pranavJibhakate commented on issue #7001: URL: https://github.com/apache/datafusion/issues/7001#issuecomment-2725340818 @alamb @ozankabak What are the ways to make RoundRobin more NUMA aware? I could come up with only this approach: I was reading about how we could pin threads to CPUs s

Re: [PR] Use insta for `DataFrame` tests [datafusion]

2025-03-14 Thread via GitHub
blaginin commented on PR #15165: URL: https://github.com/apache/datafusion/pull/15165#issuecomment-2725900528 Thank you!! Yes, absolutely, I’m going to create a few good first issues now

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-14 Thread via GitHub
jayzhan211 commented on code in PR #15201: URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996463077 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -1563,8 +1563,12 @@ async fn with_column_join_same_columns() -> Result<()> { \n Limit: skip=0, fetch=

Re: [I] Building project takes a *long* time (esp compilation time for `datafusion` core crate) [datafusion]

2025-03-14 Thread via GitHub
alamb commented on issue #13814: URL: https://github.com/apache/datafusion/issues/13814#issuecomment-2726072268 > There is some improvement in build time (201 sec -> 190 sec -> 183 sec), for datafusion crate (65 -> 65 -> 49). (some progress!!) Pulling stuff out of core helped! Thank you!

Re: [PR] BigQuery: Add support for `CREATE SCHEMA` options [datafusion-sqlparser-rs]

2025-03-14 Thread via GitHub
alamb commented on PR #1742: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1742#issuecomment-2726069502 🚀

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-14 Thread via GitHub
irenjj commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996471175 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -739,43 +736,42 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-14 Thread via GitHub
jayzhan211 commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996471597 ## datafusion/expr-common/src/signature.rs: ## @@ -865,6 +867,39 @@ impl Signature { volatility, } } + +/// Specialized Signature

Re: [PR] Add upgrade notes for array signatures [datafusion]

2025-03-14 Thread via GitHub
jkosh44 commented on code in PR #15237: URL: https://github.com/apache/datafusion/pull/15237#discussion_r1996445583 ## docs/source/library-user-guide/upgrading.md: ## @@ -212,4 +212,84 @@ To include special characters (such as newlines via `\n`) you can use an `E` lit Elapsed

[PR] chore(deps): bump aws-config from 1.5.18 to 1.6.0 [datafusion]

2025-03-14 Thread via GitHub
dependabot[bot] opened a new pull request, #15222: URL: https://github.com/apache/datafusion/pull/15222 Bumps [aws-config](https://github.com/smithy-lang/smithy-rs) from 1.5.18 to 1.6.0. Commits: see full diff in the compare view at https://github.com/smithy-lang/smithy-rs/commits

[PR] chore(deps): bump bzip2 from 0.5.1 to 0.5.2 [datafusion]

2025-03-14 Thread via GitHub
dependabot[bot] opened a new pull request, #15221: URL: https://github.com/apache/datafusion/pull/15221 Bumps [bzip2](https://github.com/trifectatechfoundation/bzip2-rs) from 0.5.1 to 0.5.2. Commits https://github.com/trifectatechfoundation/bzip2-rs/commit/f5f9d090d8a43b789ab94

Re: [PR] BigQuery: Add support for `CREATE SCHEMA` options [datafusion-sqlparser-rs]

2025-03-14 Thread via GitHub
iffyio merged PR #1742: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1742

[I] Bug: reconstruct LogicalPlan::Limit fails due to expressions out of order [datafusion]

2025-03-14 Thread via GitHub
niebayes opened a new issue, #15224: URL: https://github.com/apache/datafusion/issues/15224 ### Describe the bug Say a Limit operator with skip 0 and fetch 5. Calls `LogicalPlan::expressions` to get the expressions in the Limit operator, and then reconstructs it by feeding tho

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-14 Thread via GitHub
eliaperantoni commented on PR #15209: URL: https://github.com/apache/datafusion/pull/15209#issuecomment-2724283974 @onlyjackfrost > the same data type check in other unary operator. I'm not 100% sure what you mean by that. For `+` you used the existing error that was returned and a

Re: [I] Dataframe: select wildcard not working [datafusion]

2025-03-14 Thread via GitHub
jayzhan211 commented on issue #15218: URL: https://github.com/apache/datafusion/issues/15218#issuecomment-2724329077 Maybe we need this ```rust #[derive(Clone, Debug)] pub enum SelectExpr { Wildcard(WildcardOptions), QualifiedWildcard(TableReference, WildcardOptions

Re: [PR] chore: Attach Diagnostic to "incompatible type in unary expression" error [datafusion]

2025-03-14 Thread via GitHub
onlyjackfrost commented on code in PR #15209: URL: https://github.com/apache/datafusion/pull/15209#discussion_r1995299494 ## datafusion/sql/src/expr/unary_op.rs: ## @@ -45,7 +45,13 @@ impl SqlToRel<'_, S> { { Ok(operand) } e

Re: [PR] WIP: Move catalog_common out of core [datafusion]

2025-03-14 Thread via GitHub
logan-keede commented on code in PR #15193: URL: https://github.com/apache/datafusion/pull/15193#discussion_r1995329466 ## datafusion/core/src/lib.rs: ## @@ -699,7 +699,7 @@ pub const DATAFUSION_VERSION: &str = env!("CARGO_PKG_VERSION"); extern crate core; extern crate sqlpar

Re: [PR] Implement tree explain for CoalescePartitionsExec [datafusion]

2025-03-14 Thread via GitHub
Weijun-H commented on code in PR #15225: URL: https://github.com/apache/datafusion/pull/15225#discussion_r1995330384 ## datafusion/physical-plan/src/coalesce_partitions.rs: ## @@ -92,10 +92,12 @@ impl DisplayAs for CoalescePartitionsExec { } Non

Re: [PR] Implement `tree` explain for `ArrowFileSink` [datafusion]

2025-03-14 Thread via GitHub
irenjj commented on code in PR #15206: URL: https://github.com/apache/datafusion/pull/15206#discussion_r1995358596 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -1711,35 +1711,58 @@ physical_plan query TT explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2,

Re: [PR] feat: Attach `Diagnostic` to more than one column errors in scalar_subquery and in_subquery [datafusion]

2025-03-14 Thread via GitHub
eliaperantoni commented on PR #15143: URL: https://github.com/apache/datafusion/pull/15143#issuecomment-2724184123 Congratulations @changsun20 on merging your first DataFusion PR! Really superb work ❤️

Re: [PR] WIP: Move catalog_common out of core [datafusion]

2025-03-14 Thread via GitHub
Weijun-H commented on code in PR #15193: URL: https://github.com/apache/datafusion/pull/15193#discussion_r1995225644 ## datafusion/core/src/lib.rs: ## @@ -699,7 +699,7 @@ pub const DATAFUSION_VERSION: &str = env!("CARGO_PKG_VERSION"); extern crate core; extern crate sqlparser

Re: [PR] feat: Attach `Diagnostic` to more than one column errors in scalar_subquery and in_subquery [datafusion]

2025-03-14 Thread via GitHub
Weijun-H merged PR #15143: URL: https://github.com/apache/datafusion/pull/15143

Re: [PR] feat: Attach `Diagnostic` to more than one column errors in scalar_subquery and in_subquery [datafusion]

2025-03-14 Thread via GitHub
Weijun-H commented on PR #15143: URL: https://github.com/apache/datafusion/pull/15143#issuecomment-2724120852 Thanks @changsun20, @alamb and @eliaperantoni. I am very excited to see these informative features 😮

Re: [PR] Fix invalid schema for unions in ViewTables [datafusion]

2025-03-14 Thread via GitHub
jonahgao commented on code in PR #15135: URL: https://github.com/apache/datafusion/pull/15135#discussion_r1995225073 ## datafusion/expr/src/logical_plan/builder.rs: ## @@ -776,8 +777,32 @@ impl LogicalPlanBuilder { &missing_cols, is_distinct, )

Re: [I] Bug: reconstruct LogicalPlan::Limit fails due to expressions out of order [datafusion]

2025-03-14 Thread via GitHub
jonahgao commented on issue #15224: URL: https://github.com/apache/datafusion/issues/15224#issuecomment-2724149322 Can't reproduce on the main branch. ``` running 1 test test logical_plan::plan::tests::test_reconstruct_limit ... ok ```

Re: [I] distinct_query_sql benchmark is failing [datafusion]

2025-03-14 Thread via GitHub
zhuqi-lucas commented on issue #15213: URL: https://github.com/apache/datafusion/issues/15213#issuecomment-2724064713 Same error for me when I try to run the topk_aggregate bench: ```rust cargo bench -p datafusion --bench topk_aggregate --profile release-nonlto Finished `release-no

Re: [I] Attach `Diagnostic` to "more than one column in subquery" error [datafusion]

2025-03-14 Thread via GitHub
Weijun-H closed issue #14438: Attach `Diagnostic` to "more than one column in subquery" error URL: https://github.com/apache/datafusion/issues/14438

Re: [PR] BigQuery: Add support for `CREATE SCHEMA` options [datafusion-sqlparser-rs]

2025-03-14 Thread via GitHub
iffyio commented on code in PR #1742: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1742#discussion_r1994954819 ## tests/sqlparser_postgres.rs: ## @@ -988,6 +988,7 @@ fn parse_create_schema_if_not_exists() { Statement::CreateSchema { if_not_e

Re: [PR] Add blog link to `EquivalenceProperties` docs [datafusion]

2025-03-14 Thread via GitHub
berkaysynnada merged PR #15215: URL: https://github.com/apache/datafusion/pull/15215

Re: [PR] Add DataFrame fill_nan/fill_null [datafusion-python]

2025-03-14 Thread via GitHub
kosiew commented on PR #1019: URL: https://github.com/apache/datafusion-python/pull/1019#issuecomment-2723770280 [The upstream PR for fill_null](https://github.com/apache/datafusion/pull/14769) is included in datafusion 46.0.0. We can revisit this when datafusion-python upgrade the depe

Re: [I] Add pytest-asyncio unit tests [datafusion-python]

2025-03-14 Thread via GitHub
kosiew commented on issue #991: URL: https://github.com/apache/datafusion-python/issues/991#issuecomment-2723815613 @jsai28 Yes, you can. Just enter a comment ``` take ``` to assign this issue to yourself.

Re: [PR] Add `CASE` and `IF` statement support [datafusion-sqlparser-rs]

2025-03-14 Thread via GitHub
alamb commented on code in PR #1741: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1741#discussion_r1994218388 ## src/ast/spans.rs: ## @@ -732,6 +735,53 @@ impl Spanned for CreateIndex { } } +impl Spanned for CaseStatement { +fn span(&self) -> Span { +

[PR] Enable `used_underscore_binding` clippy lint [datafusion]

2025-03-14 Thread via GitHub
Shreyaskr1409 opened a new pull request, #15189: URL: https://github.com/apache/datafusion/pull/15189 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/14649. ## Rationale for this change To catch used underscore bindings in the codeb

Re: [PR] Int64 as default type for make_array function empty or null case [datafusion]

2025-03-14 Thread via GitHub
joroKr21 commented on code in PR #10790: URL: https://github.com/apache/datafusion/pull/10790#discussion_r1987015574 ## datafusion/functions-array/src/set_ops.rs: ## @@ -259,6 +259,17 @@ fn generic_set_lists( return general_array_distinct::(l, &field); } +//

Re: [PR] feat: Attach `Diagnostic` to more than one column errors in scalar_subquery and in_subquery [datafusion]

2025-03-14 Thread via GitHub
eliaperantoni commented on PR #15143: URL: https://github.com/apache/datafusion/pull/15143#issuecomment-2723926476 @changsun20 That's great to hear, I'm very very excited that you're signing up for GSoC! I'm gonna be a mentor there :) And I'm also delighted to hear that you'd like to work on `

Re: [PR] Minor: split datafusion-cli testing into its own CI job [datafusion]

2025-03-14 Thread via GitHub
alamb commented on PR #15075: URL: https://github.com/apache/datafusion/pull/15075#issuecomment-2724569583 > lgtm thanks @alamb its about time, for each platform we recompile and run tests for datafusion cli, although just having a linux is enough Thanks -- filed a ticket to track thi

Re: [PR] Move catalog_common out of core [datafusion]

2025-03-14 Thread via GitHub
Weijun-H merged PR #15193: URL: https://github.com/apache/datafusion/pull/15193

Re: [PR] [branch-46] Fix broken `serde` feature (#15124) [datafusion]

2025-03-14 Thread via GitHub
alamb commented on PR #15227: URL: https://github.com/apache/datafusion/pull/15227#issuecomment-2724703905 Security Audit CI should be fixed by - https://github.com/apache/datafusion/pull/15228

Re: [PR] Minor: split datafusion-cli testing into its own CI job [datafusion]

2025-03-14 Thread via GitHub
alamb merged PR #15075: URL: https://github.com/apache/datafusion/pull/15075

Re: [PR] fix: compound_field_access doesn't identifier qualifier. [datafusion]

2025-03-14 Thread via GitHub
jonahgao commented on code in PR #15153: URL: https://github.com/apache/datafusion/pull/15153#discussion_r1995820915 ## datafusion/core/tests/sql/select.rs: ## @@ -350,3 +350,45 @@ async fn test_version_function() { assert_eq!(version.value(0), expected_version); } + +#[

Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-14 Thread via GitHub
chenkovsky commented on PR #15212: URL: https://github.com/apache/datafusion/pull/15212#issuecomment-2725105874 > > for stackoverflow problem. do you have any idea? > > No, I don't have any idea currently 🤔. I'm not sure, but I guess `recursive_protection` is the right direction.

Re: [PR] chore(deps): bump aws-config from 1.5.18 to 1.6.0 [datafusion]

2025-03-14 Thread via GitHub
comphead merged PR #15222: URL: https://github.com/apache/datafusion/pull/15222

Re: [PR] chore: fix issue in release process [datafusion-comet]

2025-03-14 Thread via GitHub
parthchandra merged PR #1528: URL: https://github.com/apache/datafusion-comet/pull/1528

Re: [PR] fix: compound_field_access doesn't identifier qualifier. [datafusion]

2025-03-14 Thread via GitHub
jonahgao commented on code in PR #15153: URL: https://github.com/apache/datafusion/pull/15153#discussion_r1995837380 ## datafusion/sql/src/expr/mod.rs: ## @@ -983,14 +983,102 @@ impl SqlToRel<'_, S> { Ok(Expr::Cast(Cast::new(Box::new(expr), dt))) } +/// Extra

Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-14 Thread via GitHub
goldmedal commented on code in PR #15212: URL: https://github.com/apache/datafusion/pull/15212#discussion_r1995622142 ## datafusion/sql/tests/cases/plan_to_sql.rs: ## @@ -1746,3 +1749,153 @@ fn test_unparse_subquery_alias_with_table_pushdown() -> Result<()> { assert_eq!(sq

Re: [I] Make global context easier to access for users [datafusion-python]

2025-03-14 Thread via GitHub
timsaucer closed issue #1045: Make global context easier to access for users URL: https://github.com/apache/datafusion-python/issues/1045

Re: [PR] Simpler to see expressions in tree explain mode [datafusion]

2025-03-14 Thread via GitHub
irenjj commented on PR #15163: URL: https://github.com/apache/datafusion/pull/15163#issuecomment-2724520622 Thanks @alamb, I have added an example for Projection; it looks better than before.

Re: [I] Implement tree explain for InterleaveExec [datafusion]

2025-03-14 Thread via GitHub
alamb closed issue #15196: Implement tree explain for InterleaveExec URL: https://github.com/apache/datafusion/issues/15196

Re: [PR] [branch-46] Update ring to v0.17.13 (#15063) [datafusion]

2025-03-14 Thread via GitHub
xudong963 commented on PR #15228: URL: https://github.com/apache/datafusion/pull/15228#issuecomment-2724916905 Let's go!

Re: [PR] Renaming Internal Structs [datafusion-python]

2025-03-14 Thread via GitHub
Spaarsh commented on PR #1059: URL: https://github.com/apache/datafusion-python/pull/1059#issuecomment-2724624458 @timsaucer any suggestions on what other internal class to rename next? If I am correct, all the classes in `_internal` here should be renamed next: https://github.com/apach

Re: [PR] [branch-46] Fix broken `serde` feature (#15124) [datafusion]

2025-03-14 Thread via GitHub
xudong963 commented on PR #15227: URL: https://github.com/apache/datafusion/pull/15227#issuecomment-2724958382 Strange, can't rebase the branch-46 to the PR even if the branch-46 has been updated

Re: [PR] chore: Prepare for 0.8.0 development [datafusion-comet]

2025-03-14 Thread via GitHub
codecov-commenter commented on PR #1530: URL: https://github.com/apache/datafusion-comet/pull/1530#issuecomment-2725179241 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1530?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [I] Attach `Diagnostic` to "incompatible type in unary expression" error [datafusion]

2025-03-14 Thread via GitHub
onlyjackfrost commented on issue #14433: URL: https://github.com/apache/datafusion/issues/14433#issuecomment-2725209746 @eliaperantoni for the other unary expressions: - `Not`: I didn't see any error that a diagnostic could be attached to. - `Minus`: - I would like to attach diagn

Re: [I] [DISCUSSION] physical-plan-common crate and Revert the datasource - physical-plan Dependency [datafusion]

2025-03-14 Thread via GitHub
berkaysynnada commented on issue #15111: URL: https://github.com/apache/datafusion/issues/15111#issuecomment-2724593747 > [@berkaysynnada](https://github.com/berkaysynnada) is there any particular problem (like you are trying to implement some feature that you can not) you are trying to sol

Re: [PR] chore: Reimplement ShuffleWriterExec using interleave_record_batch [datafusion-comet]

2025-03-14 Thread via GitHub
andygrove commented on PR #1511: URL: https://github.com/apache/datafusion-comet/pull/1511#issuecomment-2724616413 I plan on starting to review this next week since I am busy with the 0.7.0 release at the moment.

[PR] Implement tree explain for `LocalLimitExec` [datafusion]

2025-03-14 Thread via GitHub
shruti2522 opened a new pull request, #15232: URL: https://github.com/apache/datafusion/pull/15232 ## Which issue does this PR close? - Closes #15025 . ## Rationale for this change ## What changes are included in this PR? ## Are these change

Re: [I] Make global context easier to access for users [datafusion-python]

2025-03-14 Thread via GitHub
timsaucer commented on issue #1045: URL: https://github.com/apache/datafusion-python/issues/1045#issuecomment-2725082592 Closed by #1060

Re: [I] Bug: reconstruct LogicalPlan::Limit fails due to expressions out of order [datafusion]

2025-03-14 Thread via GitHub
niebayes closed issue #15224: Bug: reconstruct LogicalPlan::Limit fails due to expressions out of order URL: https://github.com/apache/datafusion/issues/15224

Re: [PR] Snowflake: Support dollar quoted comment when creating tables, views, and their fields [datafusion-sqlparser-rs]

2025-03-14 Thread via GitHub
iffyio commented on code in PR #1755: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1755#discussion_r1995813212 ## tests/sqlparser_snowflake.rs: ## @@ -976,6 +976,27 @@ fn parse_sf_create_or_replace_with_comment_for_snowflake() { } } +#[test] +fn parse_sf

Re: [PR] chore: Drop support for Spark 3.3 (EOL) [datafusion-comet]

2025-03-14 Thread via GitHub
codecov-commenter commented on PR #1529: URL: https://github.com/apache/datafusion-comet/pull/1529#issuecomment-2725149648 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1529?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] Add all missing table options to be handled in any order [datafusion-sqlparser-rs]

2025-03-14 Thread via GitHub
mvzink commented on code in PR #1747: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1747#discussion_r1995867380 ## src/parser/mod.rs: ## @@ -6928,13 +6874,122 @@ impl<'a> Parser<'a> { }; } +let plain_options = self.parse_plain_optio

[PR] chore(deps): bump tokio-util from 0.7.13 to 0.7.14 [datafusion]

2025-03-14 Thread via GitHub
dependabot[bot] opened a new pull request, #15223: URL: https://github.com/apache/datafusion/pull/15223 Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.13 to 0.7.14. Commits https://github.com/tokio-rs/tokio/commit/b663abe09199c63f18abf5cc024b01fdc71553a4

Re: [I] Bug: reconstruct LogicalPlan::Limit fails due to expressions out of order [datafusion]

2025-03-14 Thread via GitHub
niebayes commented on issue #15224: URL: https://github.com/apache/datafusion/issues/15224#issuecomment-2724218464 @jonahgao It turns out my codebase was stale. Sorry.

Re: [PR] chore: re-enable GitHub discussions [datafusion-comet]

2025-03-14 Thread via GitHub
andygrove closed pull request #1532: chore: re-enable GitHub discussions URL: https://github.com/apache/datafusion-comet/pull/1532

[I] GitHub discussions have been disabled [datafusion-comet]

2025-03-14 Thread via GitHub
andygrove opened a new issue, #1533: URL: https://github.com/apache/datafusion-comet/issues/1533 ### Describe the bug We need to re-enable GitHub discussions via asf.yaml changes https://github.com/apache/infrastructure-asfyaml/tree/ng-parser?tab=readme-ov-file#repository-featu

Re: [I] Building project takes a *long* time (esp compilation time for `datafusion` core crate) [datafusion]

2025-03-14 Thread via GitHub
logan-keede commented on issue #13814: URL: https://github.com/apache/datafusion/issues/13814#issuecomment-2725515168 using following command on release 42.0.0, 45.0.0, and current main. ```bash cargo build -p datafusion-cli --timings -j 10 ``` [42.0.0] ![Image](https://githu

Re: [I] Browser-accessible official DataFusion playground / DataFusion fiddle [datafusion]

2025-03-14 Thread via GitHub
pranavJibhakate commented on issue #13818: URL: https://github.com/apache/datafusion/issues/13818#issuecomment-2725509197 Thanks a lot. I will look into the implementation of [parquet-viewer](https://parquet-viewer.xiangpeng.systems/) to see how the caching is done.

Re: [PR] [branch-46] Fix broken `serde` feature (#15124) [datafusion]

2025-03-14 Thread via GitHub
alamb merged PR #15227: URL: https://github.com/apache/datafusion/pull/15227

Re: [PR] feat: topk functionality for aggregates should support utf8view and largeutf8 [datafusion]

2025-03-14 Thread via GitHub
alamb merged PR #15152: URL: https://github.com/apache/datafusion/pull/15152

Re: [PR] Document guidelines for physical operator yielding [datafusion]

2025-03-14 Thread via GitHub
alamb merged PR #15030: URL: https://github.com/apache/datafusion/pull/15030

Re: [I] [DISCUSS] Release DataFusion `46.0.1` Patch or `46.1.0` minor release (March 2025) [datafusion]

2025-03-14 Thread via GitHub
alamb commented on issue #15151: URL: https://github.com/apache/datafusion/issues/15151#issuecomment-2725520004 Thanks @xudong963 -- I merged the outstanding issues. I think it should be good to go now

Re: [PR] [branch-46] Fix wasm32 build on version 46 [datafusion]

2025-03-14 Thread via GitHub
alamb merged PR #15229: URL: https://github.com/apache/datafusion/pull/15229

Re: [PR] Document guidelines for physical operator yielding [datafusion]

2025-03-14 Thread via GitHub
alamb commented on PR #15030: URL: https://github.com/apache/datafusion/pull/15030#issuecomment-2725520886 🚀 📖

Re: [I] union by name doesn't seem to be working correctly [datafusion]

2025-03-14 Thread via GitHub
rkrishn7 commented on issue #15236: URL: https://github.com/apache/datafusion/issues/15236#issuecomment-2726161606 take

Re: [I] Avoid casting columns when comparing ints and strings [datafusion]

2025-03-14 Thread via GitHub
jayzhan211 commented on issue #15035: URL: https://github.com/apache/datafusion/issues/15035#issuecomment-2726167036 month_id is integer and "2024" is utf8. In `type_coercion`, we cast month_id to utf8 based on the coercion rule. `CAST(foo.month_id AS Utf8) = Utf8("2024")`
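
The coercion described in the comment above can be illustrated with a minimal Python sketch. This is not DataFusion code: the helper names `eq_cast_column` and `eq_cast_literal` and the sample values are hypothetical, chosen only to contrast casting the column to a string (what `CAST(foo.month_id AS Utf8) = Utf8("2024")` does) with parsing the literal once and comparing integers (the direction the issue title suggests).

```python
# Hypothetical illustration only; none of these names come from DataFusion.

def eq_cast_column(values, literal):
    # Mirrors CAST(month_id AS Utf8) = Utf8(literal):
    # every row is converted to a string before comparing.
    return [str(v) == literal for v in values]

def eq_cast_literal(values, literal):
    # Alternative: parse the string literal once, then compare integers.
    target = int(literal)
    return [v == target for v in values]

month_ids = [2024, 202401, 2023]

# For a plain literal both strategies select the same rows.
print(eq_cast_column(month_ids, "2024"))
print(eq_cast_literal(month_ids, "2024"))

# With a zero-padded literal the two strategies disagree,
# because "2024" != "02024" as strings while int("02024") == 2024.
print(eq_cast_column(month_ids, "02024"))
print(eq_cast_literal(month_ids, "02024"))
```

Beyond the per-row conversion cost, the sketch shows that the two strategies are not always equivalent, so any rewrite that casts the literal instead of the column would need to account for literals that do not round-trip through an integer.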

[PR] Support logic optimize rule to pass the case that Utf8view datatype combined with Utf8 datatype [datafusion]

2025-03-14 Thread via GitHub
zhuqi-lucas opened a new pull request, #15239: URL: https://github.com/apache/datafusion/pull/15239 ## Which issue does this PR close? - Closes part of [15096](https://github.com/apache/datafusion/issues/15096) ## Rationale for this change Support logic optimize rule

Re: [I] DataFusion discussions are missing [datafusion]

2025-03-14 Thread via GitHub
2010YOUY01 commented on issue #15235: URL: https://github.com/apache/datafusion/issues/15235#issuecomment-2726205427 It's back 🎉 ![Image](https://github.com/user-attachments/assets/8aba147d-23a7-4c1f-be16-e8f7bfb69fb6)

[PR] Fix/15236 [datafusion]

2025-03-14 Thread via GitHub
rkrishn7 opened a new pull request, #15242: URL: https://github.com/apache/datafusion/pull/15242 ## Which issue does this PR close? Closes #15236 ## Rationale for this change An assumption made by a predicate while re-writing union inputs is incorrect. Even if an

Re: [PR] Fix/15236 [datafusion]

2025-03-14 Thread via GitHub
rkrishn7 commented on PR #15242: URL: https://github.com/apache/datafusion/pull/15242#issuecomment-2726219639 Was trying to add all the queries from the issue, but ran into problems with `normalize::convert_batches` during the SLTs 🤔

Re: [PR] chore: Upgrade `rand` crate and some other minor crates [datafusion]

2025-03-14 Thread via GitHub
comphead commented on code in PR #14967: URL: https://github.com/apache/datafusion/pull/14967#discussion_r1996412978 ## datafusion/core/tests/parquet/filter_pushdown.rs: ## @@ -65,7 +65,12 @@ fn generate_file(tempdir: &TempDir, props: WriterProperties) -> TestParquetFile t

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-14 Thread via GitHub
alamb commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996130649 ## datafusion/sqllogictest/test_files/array.slt: ## @@ -4408,12 +4422,10 @@ select array_union(arrow_cast([], 'LargeList(Int64)'), arrow_cast([], 'LargeList query

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-14 Thread via GitHub
irenjj commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996599515 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -704,29 +704,26 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-14 Thread via GitHub
jayzhan211 commented on code in PR #15149: URL: https://github.com/apache/datafusion/pull/15149#discussion_r1996528078 ## datafusion/functions-nested/src/extract.rs: ## @@ -200,6 +199,7 @@ fn array_element_inner(args: &[ArrayRef]) -> Result { let [array, indexes] = take_fu

Re: [PR] Saner handling of nulls inside arrays [datafusion]

2025-03-14 Thread via GitHub
jayzhan211 commented on PR #15149: URL: https://github.com/apache/datafusion/pull/15149#issuecomment-2726151394 When we return `Null`, can we return it with a type other than null, for example List::i64 for a list type and i64 for a non-list type? I guess this reduces the null cases we need to handle

Re: [PR] [branch-46] Fix broken `serde` feature (#15124) [datafusion]

2025-03-14 Thread via GitHub
alamb commented on PR #15227: URL: https://github.com/apache/datafusion/pull/15227#issuecomment-2725518675 Ok, looks good from here

Re: [PR] Int64 as default type for make_array function empty or null case [datafusion]

2025-03-14 Thread via GitHub
jayzhan211 commented on code in PR #10790: URL: https://github.com/apache/datafusion/pull/10790#discussion_r1996546811 ## datafusion/functions-array/src/make_array.rs: ## @@ -131,6 +131,11 @@ impl ScalarUDFImpl for MakeArray { } } +// Empty array is a special case that i

Re: [PR] Simpler to see expressions in explain `tree` mode [datafusion]

2025-03-14 Thread via GitHub
irenjj commented on code in PR #15163: URL: https://github.com/apache/datafusion/pull/15163#discussion_r1996471175 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -739,43 +736,42 @@ physical_plan 01)┌───┐ 02)│ ProjectionExec │

Re: [PR] Re-enable github discussion [datafusion]

2025-03-14 Thread via GitHub
2010YOUY01 commented on PR #15241: URL: https://github.com/apache/datafusion/pull/15241#issuecomment-2726204144 > Thanks @2010YOUY01! I was just about to work on it, but you already got it done. 👍 😄 Let's merge and check if this fix works.
