Re: [I] Apache DataFusion Google Summer of Code (GSoC) Application Guidelines [datafusion]

2025-02-19 Thread via GitHub
ozankabak commented on issue #14577: URL: https://github.com/apache/datafusion/issues/14577#issuecomment-2667823911 Merged, closing -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2667860443 Thank you @2010YOUY01 for the review, good suggestion. If I understand correctly, we will control two extra things: 1. --stop-after-max-rows, default: false , we can use this to

[I] Add DataFrame fill_nan [datafusion]

2025-02-19 Thread via GitHub
kosiew opened a new issue, #14770: URL: https://github.com/apache/datafusion/issues/14770 ### Is your feature request related to a problem or challenge? a ### Describe the solution you'd like a ### Describe alternatives you've considered a ### Additio

[PR] Add DataFrame fill_null [datafusion]

2025-02-19 Thread via GitHub
kosiew opened a new pull request, #14769: URL: https://github.com/apache/datafusion/pull/14769 ## Which issue does this PR close? - Closes #14765. ## Rationale for this change The fill_null operation is a common requirement in data processing frameworks l

Re: [PR] fix: Substrait serializer clippy error: not calling truncate [datafusion]

2025-02-19 Thread via GitHub
mbrobbel commented on code in PR #14723: URL: https://github.com/apache/datafusion/pull/14723#discussion_r1961173788 ## datafusion/substrait/src/serializer.rs: ## @@ -26,28 +26,39 @@ use substrait::proto::Plan; use std::fs::OpenOptions; use std::io::{Read, Write}; +use std::

[I] Overflow happened on: -2147483648 % -1 [datafusion]

2025-02-19 Thread via GitHub
kazuyukitanimura opened a new issue, #14771: URL: https://github.com/apache/datafusion/issues/14771 ### Describe the bug Initially reported in Comet https://github.com/apache/datafusion-comet/issues/1412 ``` $datafusion-cli Dat

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
kazuyukitanimura commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2667970729 Filed https://github.com/apache/datafusion/issues/14771

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
kazuyukitanimura commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2667956389 > > wondering if we can fix in arrow/datafusion > > It looks addition is working, I wonder what would be the difference between `+` and `%` > > It seems to be

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
wForget commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2668062863 https://github.com/apache/datafusion/blob/c176533d185b76bf4728c21d3b83ca00c633614f/datafusion/physical-expr/src/expressions/binary.rs#L313 @kazuyukitanimura `Add` without

Re: [I] Add DataFrame fill_nan [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on issue #14770: URL: https://github.com/apache/datafusion/issues/14770#issuecomment-2668069671 A possible implementation is to design a bunch of fill UDFs, such as fill_value, fill_prev, fill_linear etc, which act on a column expression in the select list. The users c

[PR] chore: fix tpch data generator [datafusion-ballista]

2025-02-19 Thread via GitHub
milenkovicm opened a new pull request, #1186: URL: https://github.com/apache/datafusion-ballista/pull/1186 # Which issue does this PR close? Closes #None. # Rationale for this change The previous Docker container used for tpch generation no longer works, thus tpch.sh

Re: [PR] Allow `FileSource`-specific repartitioning [datafusion]

2025-02-19 Thread via GitHub
AdamGS commented on PR #14754: URL: https://github.com/apache/datafusion/pull/14754#issuecomment-2668065514 The CI failure doesn't seem related to the change

Re: [I] TypeSignature::Coercible for math functions [datafusion]

2025-02-19 Thread via GitHub
findepi commented on issue #14763: URL: https://github.com/apache/datafusion/issues/14763#issuecomment-2668158147 > `Log` for example can be handled with `TypeSignature::Coercible` where the desired type is float and allow source types are integer. We must not encode coercion rules in

Re: [PR] Specify rust toolchain explicitly, document how to change it [datafusion]

2025-02-19 Thread via GitHub
findepi commented on PR #14655: URL: https://github.com/apache/datafusion/pull/14655#issuecomment-2668166821 Now cargo will auto-update my local toolchain. ``` $ cargo test --test sqllogictests -- map.slt --complete info: syncing channel updates for '1.84.1-aarch64-apple-darwin

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2668192400 I think it will also benefit https://github.com/apache/datafusion/issues/14510

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2668188668 Addressed the comments in the latest PR, @2010YOUY01, thanks!

Re: [I] TypeSignature::Coercible for math functions [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on issue #14763: URL: https://github.com/apache/datafusion/issues/14763#issuecomment-2668242634 > We must not encode coercion rules in every single function. Standard coercion rules must be property of the engine. Not a property of every single function separately.

Re: [I] [EPIC] Decouple logical from physical types [datafusion]

2025-02-19 Thread via GitHub
findepi commented on issue #12622: URL: https://github.com/apache/datafusion/issues/12622#issuecomment-2668262491 > > I am relying here that we can extract the necessary type information from the schema or record batches > > This is probably not true. Scalar is a free constant unlike

[PR] Create gsoc_project_ideas.md [datafusion]

2025-02-19 Thread via GitHub
oznur-synnada opened a new pull request, #14772: URL: https://github.com/apache/datafusion/pull/14772 Creating new page under Contributor Guide for GSoC 2025 project ideas per Google's request ## Which issue does this PR close? ## Rationale for this change Im

Re: [PR] Create gsoc_project_ideas.md [datafusion]

2025-02-19 Thread via GitHub
oznur-synnada closed pull request #14772: Create gsoc_project_ideas.md URL: https://github.com/apache/datafusion/pull/14772

Re: [I] Runtime-adaptive data representation [datafusion]

2025-02-19 Thread via GitHub
findepi commented on issue #12720: URL: https://github.com/apache/datafusion/issues/12720#issuecomment-2668289411 > As a general rule we don't support operations on heterogenous types to avoid the combinatorial explosion of codegen that would result, and the corresponding impact on buil

Re: [PR] refactor: move `DataSource` to `datafusion-datasource` [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14671: URL: https://github.com/apache/datafusion/pull/14671#issuecomment-2668289552 > @alamb What do we want to do with benches? (they are causing circular dependency) except that I think this is ready for review. I think we should move the benches into `datafus

Re: [I] TypeSignature::Coercible for math functions [datafusion]

2025-02-19 Thread via GitHub
findepi commented on issue #14763: URL: https://github.com/apache/datafusion/issues/14763#issuecomment-2668292359 > In an extensible query engine, users should define coercion rules as part of the function definition, specifying how data types are converted Why?? > The engine i

Re: [I] Datafusion can't seem to cast evolving structs [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #14757: URL: https://github.com/apache/datafusion/issues/14757#issuecomment-2668299534 > cc [@alamb](https://github.com/alamb) [@zhuqi-lucas](https://github.com/zhuqi-lucas) many of my users can't query their data because of this evolution. any chance you can take a

[PR] Update index.rst [datafusion]

2025-02-19 Thread via GitHub
oznur-synnada opened a new pull request, #14773: URL: https://github.com/apache/datafusion/pull/14773 Added contributor-guide/gsoc_project_ideas ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included

Re: [I] Simplify `EXPR LIKE 'constant'` to `expr = 'constant'` [datafusion]

2025-02-19 Thread via GitHub
alamb closed issue #13192: Simplify `EXPR LIKE 'constant'` to `expr = 'constant'` URL: https://github.com/apache/datafusion/issues/13192

Re: [PR] Allow `FileSource`-specific repartitioning [datafusion]

2025-02-19 Thread via GitHub
mertak-synnada commented on PR #14754: URL: https://github.com/apache/datafusion/pull/14754#issuecomment-2668295526 While refactoring old sources into the `DataSourceExec`, I intended to centralize this file repartition logic for all file types but didn't consider a custom file type in that

Re: [I] Simplify `EXPR LIKE 'constant'` to `expr = 'constant'` [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #13192: URL: https://github.com/apache/datafusion/issues/13192#issuecomment-2668292909 Agreed -- thank you @ngli-me

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
alamb commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2668307795 > SQLPage does not have a huge install base, and it took just a few days before the first crash report. DataFusion does have a pretty large user base and we haven't g

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961476992 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coercion

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
wForget commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2668316059 I also tested `wrapping_rem` and it seems to work as expected: ``` let a = -2147483648i32; let b = -1i32; let c = a.wrapping_rem(b); println!("a: {}, b: {}, c:

Re: [PR] fix: Substrait serializer clippy error: not calling truncate [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on code in PR #14723: URL: https://github.com/apache/datafusion/pull/14723#discussion_r1961479227 ## datafusion/substrait/src/serializer.rs: ## @@ -26,28 +26,39 @@ use substrait::proto::Plan; use std::fs::OpenOptions; use std::io::{Read, Write}; +use std::

[PR] Create gsoc_project_ideas.md [datafusion]

2025-02-19 Thread via GitHub
oznur-synnada opened a new pull request, #14774: URL: https://github.com/apache/datafusion/pull/14774 Create new page under Contributor Guide for GSoC 2025 ## Which issue does this PR close? ## Rationale for this change Google requested to have project ideas under

Re: [I] Library Guide: Extending DataFusion's operators: custom LogicalPlan and `ExecutionPlans` [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #7308: URL: https://github.com/apache/datafusion/issues/7308#issuecomment-2668316772 > Can I take this issue? Yes of course, please do! Please ping me on the PR if you need a reviewer as improving the DataFusion documentation is high on my list of things to do

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
lovasoa commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2668345274 Yes, I initially thought that the problem was specific to their very restricted environment, but looking at the code in stacker, they crash the entire program on ANY erro

Re: [PR] fix: Substrait serializer clippy error: not calling truncate [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on code in PR #14723: URL: https://github.com/apache/datafusion/pull/14723#discussion_r1961488850 ## datafusion/substrait/src/serializer.rs: ## @@ -26,28 +26,39 @@ use substrait::proto::Plan; use std::fs::OpenOptions; use std::io::{Read, Write}; +use std::

Re: [PR] Update index.rst [datafusion]

2025-02-19 Thread via GitHub
berkaysynnada closed pull request #14773: Update index.rst URL: https://github.com/apache/datafusion/pull/14773

Re: [I] Datafusion can't seem to handle schema evolution [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #14753: URL: https://github.com/apache/datafusion/issues/14753#issuecomment-2668334767 Possibly related: - https://github.com/apache/datafusion/issues/14755 - https://github.com/apache/datafusion/issues/14757

Re: [PR] fix: Substrait serializer clippy error: not calling truncate [datafusion]

2025-02-19 Thread via GitHub
niebayes commented on code in PR #14723: URL: https://github.com/apache/datafusion/pull/14723#discussion_r1961489958 ## datafusion/substrait/src/serializer.rs: ## @@ -26,28 +26,39 @@ use substrait::proto::Plan; use std::fs::OpenOptions; use std::io::{Read, Write}; +use std::

Re: [I] Overflow happened on: -2147483648 % -1 [datafusion-comet]

2025-02-19 Thread via GitHub
parthchandra commented on issue #1412: URL: https://github.com/apache/datafusion-comet/issues/1412#issuecomment-2668355055 Info on rust integer overflow: https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md ``` The operations +, -, *, can underflow and overflow.

Re: [I] TypeSignature::Coercible for math functions [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on issue #14763: URL: https://github.com/apache/datafusion/issues/14763#issuecomment-2668355853 Why not? I don't see any problem with my statement

Re: [PR] feat: add spark_signed_integer_remainder native function for compatibility with spark [datafusion-comet]

2025-02-19 Thread via GitHub
parthchandra commented on PR #1416: URL: https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2668373113 The panic is because of the language specification: https://github.com/apache/datafusion-comet/issues/1412#issuecomment-2668355055 The spec does not explain why but the
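
The overflow discussed in this thread (`-2147483648 % -1`) can be reproduced and avoided with Rust's standard integer methods. The following is a minimal sketch, not the code from either PR: the helper names `rem_or_none` and `rem_wrapping` are hypothetical, and only `i32::checked_rem` and `i32::wrapping_rem` from the standard library are assumed.

```rust
/// Hypothetical helper: remainder that reports failure instead of panicking.
/// `checked_rem` returns `None` when the divisor is zero or when the
/// operation overflows (the `i32::MIN % -1` case).
fn rem_or_none(a: i32, b: i32) -> Option<i32> {
    a.checked_rem(b)
}

/// Hypothetical helper: remainder with wrapping semantics.
/// `wrapping_rem` yields 0 for `i32::MIN % -1`; a zero divisor would
/// still panic, so it is filtered out first.
fn rem_wrapping(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        Some(a.wrapping_rem(b))
    }
}
```

Calling `rem_wrapping(i32::MIN, -1)` gives `Some(0)`, whereas `rem_or_none(i32::MIN, -1)` gives `None`. Which of the two behaviours a query engine should expose is a design decision this sketch does not settle.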

Re: [PR] feat: Add ScalarUDF support in FFI crate [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14579: URL: https://github.com/apache/datafusion/pull/14579#discussion_r1961592529 ## datafusion/ffi/src/udf.rs: ## @@ -0,0 +1,346 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See th

Re: [PR] Update Community Events in concepts-readings-events.md [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14629: URL: https://github.com/apache/datafusion/pull/14629#issuecomment-2668524273 > * [Weekly Call for the Community](https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit#heading=h.kpjkpncdmt1g) - I don't know if it'd be ok to share t

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
peter-toth commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2668487879 I didn't notice the missing error handling either. Your https://github.com/rust-lang/stacker/pull/116 seems like a nice improvement. If it doesn't get accepted for som

Re: [I] Add `try_new` for `LogicalPlan::Join` `Join` and others [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #14363: URL: https://github.com/apache/datafusion/issues/14363#issuecomment-2668525129 Thanks @Spaarsh -- I unassigned you

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961634449 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coer

Re: [I] Migrate Crypto function to `inovke_with_args` [datafusion]

2025-02-19 Thread via GitHub
goldmedal closed issue #14704: Migrate Crypto function to `inovke_with_args` URL: https://github.com/apache/datafusion/issues/14704

Re: [PR] chore: migrate crypto functions to invoke_with_args [datafusion]

2025-02-19 Thread via GitHub
goldmedal commented on PR #14764: URL: https://github.com/apache/datafusion/pull/14764#issuecomment-2668692756 Thanks @jatin510 for reviewing 👍

Re: [PR] chore: migrate crypto functions to invoke_with_args [datafusion]

2025-02-19 Thread via GitHub
goldmedal merged PR #14764: URL: https://github.com/apache/datafusion/pull/14764

Re: [PR] Add support for PostgreSQL/Redshift geometric operators [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
iffyio commented on PR #1723: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1723#issuecomment-2668754883 @benrsatori could you take a look at the CI failures?

Re: [I] [EPIC] Easier extension configuration SessionState / SessionConfig [datafusion]

2025-02-19 Thread via GitHub
alamb closed issue #12550: [EPIC] Easier extension configuration SessionState / SessionConfig URL: https://github.com/apache/datafusion/issues/12550

Re: [I] [EPIC] Easier extension configuration SessionState / SessionConfig [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #12550: URL: https://github.com/apache/datafusion/issues/12550#issuecomment-2668781576 Thanks for the ping @Omega359 -- I agree

Re: [PR] Replace `Method` and `CompositeAccess` with `CompoundFieldAccess` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
alamb commented on PR #1716: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1716#issuecomment-2668778241 Looks like this needs a merge up from main to resolve some conflicts

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961771860 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coer

Re: [PR] Add union_tag scalar function [datafusion]

2025-02-19 Thread via GitHub
Omega359 commented on PR #14687: URL: https://github.com/apache/datafusion/pull/14687#issuecomment-2668798412 > Yeah I agree. I think we should file a "discussion" type ticket to have this discussion. I can file one at some point later (I am low on time this week) or if you can that would b

Re: [PR] minor: remove custom extract_ok! macro [datafusion]

2025-02-19 Thread via GitHub
alamb merged PR #14733: URL: https://github.com/apache/datafusion/pull/14733

Re: [I] [EPIC] Substrait: Add producer and consumer for physical plans [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #5173: URL: https://github.com/apache/datafusion/issues/5173#issuecomment-2668799251 Hi @niebayes -- I recommend coordinating with @vbarua and @Blizzara and @wackywendell , others who I think use substrait with physical plans I think we maybe already have ph

Re: [PR] Reduce size of `Expr` struct [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2668783186 I'll try and find time to run some sql planning benchmarks

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
findepi commented on code in PR #14689: URL: https://github.com/apache/datafusion/pull/14689#discussion_r1961749786 ## datafusion/functions-aggregate/src/count.rs: ## @@ -139,6 +148,185 @@ impl AggregateUDFImpl for Count { "count" } +fn schema_name(&self, par

Re: [PR] Fix pyarrow test [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #4450: URL: https://github.com/apache/datafusion/pull/4450#discussion_r1961783784 ## .github/workflows/rust.yml: ## @@ -296,7 +296,7 @@ jobs: test-datafusion-pyarrow: name: cargo test pyarrow (amd64) needs: [linux-build-lib] -runs-o

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961784030 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coer

Re: [PR] minor: simplify `union_extract` code [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14640: URL: https://github.com/apache/datafusion/pull/14640#issuecomment-2668498226 Thanks @xudong963

Re: [PR] Remove CountWildcardRule in Analyzer and move the functionality in ExprPlanner, add `plan_aggregate` and `plan_window` to planner [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14689: URL: https://github.com/apache/datafusion/pull/14689#issuecomment-2668500874 > We need display name / schema name for WindowFunction as well > > #14750 So close!

Re: [PR] `AggregateUDFImpl::window_function_schema_name` and `AggregateUDFImpl::window_function_display_name` for window aggregate function [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14750: URL: https://github.com/apache/datafusion/pull/14750#discussion_r1961585600 ## datafusion/expr/src/expr.rs: ## @@ -2621,33 +2650,47 @@ impl Display for Expr { // Expr::ScalarFunction(ScalarFunction { func, args }) => {

Re: [PR] `AggregateUDFImpl::window_function_schema_name` and `AggregateUDFImpl::window_function_display_name` for window aggregate function [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14750: URL: https://github.com/apache/datafusion/pull/14750#issuecomment-2668506982 Since this PR is following existing patterns, and I expect it to be uncontroversial, I merged it in directly

Re: [PR] `AggregateUDFImpl::window_function_schema_name` and `AggregateUDFImpl::window_function_display_name` for window aggregate function [datafusion]

2025-02-19 Thread via GitHub
alamb merged PR #14750: URL: https://github.com/apache/datafusion/pull/14750

Re: [PR] refactor: move `DataSource` to `datafusion-datasource` [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14671: URL: https://github.com/apache/datafusion/pull/14671#issuecomment-2668508619 Thank you very much for this PR @logan-keede -- I will try and review it later today (I am out of the office this week so I have only limited time to review PRs)

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
findepi commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961624797 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coercio

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14766: URL: https://github.com/apache/datafusion/pull/14766#discussion_r1961553561 ## datafusion-cli/src/exec.rs: ## @@ -249,8 +255,21 @@ pub(super) async fn exec_and_print( } else { // Bounded stream; collected results are pr

Re: [PR] `AggregateUDFImpl::window_function_schema_name` and `AggregateUDFImpl::window_function_display_name` for window aggregate function [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on PR #14750: URL: https://github.com/apache/datafusion/pull/14750#issuecomment-2668549211 Thanks @alamb

Re: [I] [EPIC] Decouple logical from physical types [datafusion]

2025-02-19 Thread via GitHub
findepi commented on issue #12622: URL: https://github.com/apache/datafusion/issues/12622#issuecomment-2668553621 > Should type coercion be handled in the physical optimizer, Did you mean physical planner? Physical plan is statically typed, and uses DataType, If we go from LP to PP

Re: [I] [EPIC] Decouple logical from physical types [datafusion]

2025-02-19 Thread via GitHub
jayzhan211 commented on issue #12622: URL: https://github.com/apache/datafusion/issues/12622#issuecomment-2668605924 > Did you mean physical planner? Physical plan is statically typed, and uses DataType, If we go from LP to PP, we need to use types correctly (just as we need to use correc

Re: [I] A complete solution for stable and safe sort with spill [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #14692: URL: https://github.com/apache/datafusion/issues/14692#issuecomment-2668835909 Thank you for filing this @zhuqi-lucas . I have also added it to our list here - https://github.com/apache/datafusion/issues/14077

Re: [I] Attach `Diagnostic` to "function x does not exist" error [datafusion]

2025-02-19 Thread via GitHub
onlyjackfrost commented on issue #14430: URL: https://github.com/apache/datafusion/issues/14430#issuecomment-2668868355 Hi @eliaperantoni sorry for the late reply. I'm working on this and aim to raise a PR over the weekend. I'll let you know if I have any questions. Thanks for being s

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on code in PR #14766: URL: https://github.com/apache/datafusion/pull/14766#discussion_r1961832072 ## datafusion-cli/src/main.rs: ## @@ -121,6 +121,13 @@ struct Args { )] maxrows: MaxRows, +#[clap( +short, +long, +help

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961792494 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coercion

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on code in PR #14766: URL: https://github.com/apache/datafusion/pull/14766#discussion_r1961835327 ## datafusion-cli/src/exec.rs: ## @@ -249,8 +255,21 @@ pub(super) async fn exec_and_print( } else { // Bounded stream; collected results

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
zhuqi-lucas commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2668895304 > Thank you for working on this @zhuqi-lucas -- I had some comments. Let me know what you think Great idea, thank you @alamb , addressed in latest PR. -- This is an auto

Re: [PR] Signature::Coercible with user defined implicit casting [datafusion]

2025-02-19 Thread via GitHub
findepi commented on code in PR #14440: URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961727057 ## datafusion/expr-common/src/signature.rs: ## @@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec { } } +/// Represents type coercio

Re: [I] [EPIC] Decouple logical from physical types [datafusion]

2025-02-19 Thread via GitHub
findepi commented on issue #12622: URL: https://github.com/apache/datafusion/issues/12622#issuecomment-2668739537 > The type coercion in LP only take care about NativeType::String and we end up with `DataType::Utf8` and `DataType::Utf8View`. We consider these resolved types in LP. If

[PR] feat: Add Aggregate UDF to FFI interface [datafusion]

2025-02-19 Thread via GitHub
timsaucer opened a new pull request, #14775: URL: https://github.com/apache/datafusion/pull/14775 ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes teste

Re: [I] Document PREPARE statements [datafusion]

2025-02-19 Thread via GitHub
jonahgao commented on issue #13570: URL: https://github.com/apache/datafusion/issues/13570#issuecomment-2668747633 Maybe we can follow DuckDB's syntax; it seems quite readable to me. https://duckdb.org/docs/sql/query_syntax/prepared_statements.html#named-parameters-parameter

Re: [I] Overflow happened on: -2147483648 % -1 [datafusion]

2025-02-19 Thread via GitHub
wForget commented on issue #14771: URL: https://github.com/apache/datafusion/issues/14771#issuecomment-2668421977 As commented in https://github.com/apache/datafusion-comet/pull/1416#issuecomment-2668062863, `Add` operator has no error because `fail_on_overflow` is `false` by default, sho

Re: [PR] fix: Substrait serializer clippy error: not calling truncate [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14723: URL: https://github.com/apache/datafusion/pull/14723#discussion_r1961532221 ## datafusion/substrait/src/serializer.rs: ## @@ -26,28 +26,39 @@ use substrait::proto::Plan; use std::fs::OpenOptions; use std::io::{Read, Write}; +use std::pat

Re: [PR] Create gsoc_project_ideas.md [datafusion]

2025-02-19 Thread via GitHub
berkaysynnada merged PR #14774: URL: https://github.com/apache/datafusion/pull/14774 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@

Re: [PR] fix: Substrait serializer clippy error: not calling truncate [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14723: URL: https://github.com/apache/datafusion/pull/14723#discussion_r1961532725 ## datafusion/substrait/src/serializer.rs: ## @@ -26,28 +26,39 @@ use substrait::proto::Plan; use std::fs::OpenOptions; use std::io::{Read, Write}; +use std::pat

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14766: URL: https://github.com/apache/datafusion/pull/14766#discussion_r1961553561 ## datafusion-cli/src/exec.rs: ## @@ -249,8 +255,21 @@ pub(super) async fn exec_and_print( } else { // Bounded stream; collected results are pr

Re: [PR] Add union_tag scalar function [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14687: URL: https://github.com/apache/datafusion/pull/14687#issuecomment-2668519587 > @alamb - here is another function coming in (xxhash, regexp_extract (both versions of it), array_min/array_max functions) where it is not clear what should be accepted and what shoul

Re: [I] Datafusion listing table evolution is dependent on file order [datafusion]

2025-02-19 Thread via GitHub
alamb commented on issue #14755: URL: https://github.com/apache/datafusion/issues/14755#issuecomment-2668437833 Improving schema evolution seems like a good thing to work on in my opinion -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Add `#[recursive]` [datafusion-sqlparser-rs]

2025-02-19 Thread via GitHub
alamb commented on PR #1522: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1522#issuecomment-2668397910 Interesting -- I haven't looked at the code or what functions are used (and thus under what circumstances such errors happen or how likely they are to occur) -- This is

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #14766: URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2668448654 > It is a problem of datafusion-cli. If datafusion-cli decides to hold all the result batches in memory, it should create a memory consumer for itself and reserve memory for the result

[I] Support IntegralDivide function [datafusion-comet]

2025-02-19 Thread via GitHub
wForget opened a new issue, #1422: URL: https://github.com/apache/datafusion-comet/issues/1422 ### What is the problem the feature request solves? While analyzing #1412, I found that Spark's IntegralDivide function is not yet supported. After #1412 is resolved, I want to try to implem

Re: [PR] feat: Improve datafusion-cli memory usage and considering reserve mem… [datafusion]

2025-02-19 Thread via GitHub
alamb commented on code in PR #14766: URL: https://github.com/apache/datafusion/pull/14766#discussion_r1961557419 ## datafusion-cli/src/main.rs: ## @@ -121,6 +121,13 @@ struct Args { )] maxrows: MaxRows, +#[clap( +short, +long, +help = "Wh

Re: [PR] Add sum statistics and PhysicalExpr::column_statistics [datafusion]

2025-02-19 Thread via GitHub
alamb commented on PR #13736: URL: https://github.com/apache/datafusion/pull/13736#issuecomment-2668466962 For anyone else following along, the next PR is here: - https://github.com/apache/datafusion/pull/14699 -- This is an automated message from the Apache Git Service. To respond to t
