Re: [PR] feat: metadata columns [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2621372943 I also have another approach. ```rust struct ColumnIndex { pub index: usize, pub is_metadata_column: bool } impl Into for ColumnIndex { }

[PR] Fix UNION field nullability tracking [datafusion]

2025-01-29 Thread via GitHub
findepi opened a new pull request, #14356: URL: https://github.com/apache/datafusion/pull/14356 This commit fixes two bugs related to UNION handling - when constructing union plan nullability of the other union branch was ignored, thus resulting field could easily have incorrect n

Re: [PR] Feature: Monotonic Sets [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on code in PR #14271: URL: https://github.com/apache/datafusion/pull/14271#discussion_r1933977485 ## datafusion/physical-expr/src/window/standard.rs: ## @@ -65,33 +65,19 @@ impl StandardWindowExpr { &self.expr } -/// Adds any equivalent or

Re: [PR] Fix incorrect searched CASE optimization [datafusion]

2025-01-29 Thread via GitHub
findepi closed pull request #14349: Fix incorrect searched CASE optimization URL: https://github.com/apache/datafusion/pull/14349 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [I] Implement physical optimizer rule to apply type coercion to physical plans [datafusion]

2025-01-29 Thread via GitHub
andygrove commented on issue #14324: URL: https://github.com/apache/datafusion/issues/14324#issuecomment-2621810595 > BTW this might be fairly straightforward to add as an extension in comet (maybe it doesn't have to be in the datafusion core if it is only used by systems that use the physi

Re: [PR] add manual trigger for extended tests in pull requests [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on code in PR #14331: URL: https://github.com/apache/datafusion/pull/14331#discussion_r1934000529 ## .github/workflows/extended.yml: ## @@ -31,14 +31,38 @@ on: push: branches: - main + issue_comment: +types: [created] + +permissions: + pul

[I] Decimal promotion is not applied consistently [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove opened a new issue, #1350: URL: https://github.com/apache/datafusion-comet/issues/1350 ### Describe the bug In `QueryPlanSerde`, we have code in `exprToProto` to ensure the decimal precision is the same for both children of a binary expression. ```scala def exprT

Re: [PR] Add related source code locations to errors [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #13664: URL: https://github.com/apache/datafusion/pull/13664#discussion_r1933723286 ## datafusion/common/src/diagnostic.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] Add related source code locations to errors [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #13664: URL: https://github.com/apache/datafusion/pull/13664#discussion_r1933725543 ## datafusion/common/src/diagnostic.rs: ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreement

Re: [PR] add manual trigger for extended tests in pull requests [datafusion]

2025-01-29 Thread via GitHub
edmondop commented on code in PR #14331: URL: https://github.com/apache/datafusion/pull/14331#discussion_r1933841345 ## .github/workflows/extended.yml: ## @@ -31,14 +31,38 @@ on: push: branches: - main + issue_comment: +types: [created] + +permissions: + pul

Re: [I] Implement physical optimizer rule to apply type coercion to physical plans [datafusion]

2025-01-29 Thread via GitHub
edmondop commented on issue #14324: URL: https://github.com/apache/datafusion/issues/14324#issuecomment-2621615427 Is it possible to do type coercion only at the physical layer? Or does it bring benefits at the logical layer so that we need to do type coercion twice, once at the logical lay

Re: [PR] fix: LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
zhuqi-lucas commented on code in PR #14245: URL: https://github.com/apache/datafusion/pull/14245#discussion_r1934090147 ## datafusion/physical-optimizer/src/limit_pushdown.rs: ## @@ -248,7 +247,15 @@ pub fn pushdown_limit_helper( } } else { //

Re: [PR] fix: LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
zhuqi-lucas commented on code in PR #14245: URL: https://github.com/apache/datafusion/pull/14245#discussion_r1934092407 ## datafusion/physical-optimizer/src/limit_pushdown.rs: ## @@ -248,7 +247,15 @@ pub fn pushdown_limit_helper( } } else { //

Re: [I] Implement Common Subexpression Elimination optimizer rule [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on issue #942: URL: https://github.com/apache/datafusion-comet/issues/942#issuecomment-2621952623 The DataFusion PR https://github.com/apache/datafusion/pull/13046 is still waiting for a review. I am adding this issue back onto the 0.6 milestone as a reminder.

[PR] Fix DDL generation in case of an empty arguments FUNCTION. [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
remysaissy opened a new pull request, #1690: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1690 Functions with an empty argument list are properly parsed into the AST but the string representation of such function removes the parenthesis required after the function name, causi

Re: [PR] fix: fetch is missed during EnforceDistribution [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14207: URL: https://github.com/apache/datafusion/pull/14207#issuecomment-2621967843 I merged up from main to resolve a conflict on this PR as part of my review

[PR] refactor: switch `BooleanBufferBuilder` to `NullBufferBuilder` in single_group_by [datafusion]

2025-01-29 Thread via GitHub
Chen-Yuan-Lai opened a new pull request, #14360: URL: https://github.com/apache/datafusion/pull/14360 ## Which issue does this PR close? Closes #14115 . ## Rationale for this change As mentioned in #14115 , several examples in DataFusion codebase still u

Re: [I] Decimal promotion is not applied consistently [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove closed issue #1350: Decimal promotion is not applied consistently URL: https://github.com/apache/datafusion-comet/issues/1350

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-29 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1933570686 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -186,6 +187,15 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch))

Re: [PR] Fix Float and Decimal coercion [datafusion]

2025-01-29 Thread via GitHub
findepi commented on PR #14273: URL: https://github.com/apache/datafusion/pull/14273#issuecomment-2621162538 I am OK having DF coercions extensible (https://github.com/apache/datafusion/issues/14296), but first and foremost I want them to be **_optional_**. At ~SDF~ dbt, we create fully res

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-29 Thread via GitHub
jkosh44 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1933573232 ## datafusion/functions-nested/src/extract.rs: ## @@ -330,7 +330,8 @@ pub(super) struct ArraySlice { impl ArraySlice { pub fn new() -> Self { Self {

[PR] Add hook for sharing join state in distributed execution [datafusion]

2025-01-29 Thread via GitHub
thinkharderdev opened a new pull request, #12523: URL: https://github.com/apache/datafusion/pull/12523 ## Which issue does this PR close? Closes #12454 ## Rationale for this change ## What changes are included in this PR? ## Are these chang

Re: [I] support: Date +/plus Int or date_add function [datafusion]

2025-01-29 Thread via GitHub
DanCodedThis commented on issue #6876: URL: https://github.com/apache/datafusion/issues/6876#issuecomment-2621860544 @edmondop I think I agree with your statement. - Also, does the new `TypeSignatureClass` + `logical_string()` allow parsing `month` as not a column/keyword, but as `Utf8`

Re: [PR] equivalence classes: use normalized mapping for projection [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on PR #14327: URL: https://github.com/apache/datafusion/pull/14327#issuecomment-2621860123 I am guessing what is meant was writing an SLT test whose output plan would change if this logic/fix weren't there. Maybe an optimization that wouldn't take place because of the no

[PR] Minor: include the number of files run in sqllogictest display [datafusion]

2025-01-29 Thread via GitHub
alamb opened a new pull request, #14359: URL: https://github.com/apache/datafusion/pull/14359 ## Which issue does this PR close? - Follow on to https://github.com/apache/datafusion/pull/14355 - Follow on to ## Rationale for this change While testing https://github.com/apa

Re: [PR] Minor: include the number of files run in sqllogictest display [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14359: URL: https://github.com/apache/datafusion/pull/14359#discussion_r1934030696 ## datafusion/sqllogictest/bin/sqllogictests.rs: ## @@ -491,9 +497,7 @@ impl TestFile { } } -fn read_test_files<'a>( -options: &'a Options, -) -> Result

Re: [PR] Restore ability to run single SLT file [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14355: URL: https://github.com/apache/datafusion/pull/14355#issuecomment-2621874234 > The functionality was probably accidentally lost in #13936 BTW I think it is currently possible to run a single file. For example this will run both `union.slt` and `p

Re: [PR] test: attempt to analyze boundaries for select columns [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on PR #14308: URL: https://github.com/apache/datafusion/pull/14308#issuecomment-2621875650 Seems like we need to make certain view types comparable with their parent types.

Re: [PR] chore(deps): bump serde_json from 1.0.137 to 1.0.138 in /datafusion-cli [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14351: URL: https://github.com/apache/datafusion/pull/14351

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14356: URL: https://github.com/apache/datafusion/pull/14356#issuecomment-2621877094 @wiedld can you also please review this PR?

Re: [PR] Feature Unifying source execution plans [datafusion]

2025-01-29 Thread via GitHub
ozankabak commented on PR #14224: URL: https://github.com/apache/datafusion/pull/14224#issuecomment-2621880797 Thanks for all the review. We will focus on this again next week, improve the PR, and ask for more thoughts from the community.

Re: [PR] chore(deps): bump tempfile from 3.15.0 to 3.16.0 in /datafusion-cli [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14350: URL: https://github.com/apache/datafusion/pull/14350

Re: [I] Decimal promotion is not applied consistently [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on issue #1350: URL: https://github.com/apache/datafusion-comet/issues/1350#issuecomment-2621887848 nm, I see now that `DecimalPrecision.promote` calls `expr.transformUp` and transforms the whole expression tree.
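
The point about `transformUp` covering the whole tree can be illustrated with a small Rust sketch. This is not Spark's or Comet's code: the `Expr` enum, `transform_up`, and the `promote` rule below are hypothetical stand-ins, and the precision arithmetic is deliberately simplified. It only shows why a single bottom-up rewrite started at the root also reaches nested binary expressions.

```rust
// Hypothetical miniature expression tree; not Spark's or Comet's actual types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A decimal value carrying only its precision.
    Decimal { precision: u8 },
    /// A binary expression with two children.
    Add(Box<Expr>, Box<Expr>),
    /// A cast of the child to the given precision.
    Cast { child: Box<Expr>, precision: u8 },
}

/// Precision of the value an expression produces
/// (simplified assumption: Add yields the max of its children).
fn precision(expr: &Expr) -> u8 {
    match expr {
        Expr::Decimal { precision } => *precision,
        Expr::Cast { precision, .. } => *precision,
        Expr::Add(l, r) => precision(l).max(precision(r)),
    }
}

/// Rewrite the children first, then apply `rule` to the rebuilt node (post-order).
fn transform_up(expr: Expr, rule: &dyn Fn(Expr) -> Expr) -> Expr {
    let rebuilt = match expr {
        Expr::Add(l, r) => Expr::Add(
            Box::new(transform_up(*l, rule)),
            Box::new(transform_up(*r, rule)),
        ),
        Expr::Cast { child, precision } => Expr::Cast {
            child: Box::new(transform_up(*child, rule)),
            precision,
        },
        leaf => leaf,
    };
    rule(rebuilt)
}

/// Hypothetical promotion rule: cast the narrower child of an Add to the wider precision.
fn promote(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => {
            let target = precision(&l).max(precision(&r));
            let widen = |e: Box<Expr>| {
                if precision(&e) < target {
                    Box::new(Expr::Cast { child: e, precision: target })
                } else {
                    e
                }
            };
            Expr::Add(widen(l), widen(r))
        }
        other => other,
    }
}

fn main() {
    // (dec(5) + dec(10)) + dec(20): the inner Add sits below the root.
    let tree = Expr::Add(
        Box::new(Expr::Add(
            Box::new(Expr::Decimal { precision: 5 }),
            Box::new(Expr::Decimal { precision: 10 }),
        )),
        Box::new(Expr::Decimal { precision: 20 }),
    );
    // One call at the root rewrites both the inner and the outer Add.
    let promoted = transform_up(tree, &promote);
    println!("{promoted:#?}");
}
```

Because the rule runs on every node after its children, a caller does not need to re-apply promotion separately for each binary expression it encounters.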

Re: [PR] fix: LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
zhuqi-lucas commented on code in PR #14245: URL: https://github.com/apache/datafusion/pull/14245#discussion_r1933509296 ## datafusion/core/tests/dataframe/mod.rs: ## @@ -5182,3 +5177,32 @@ async fn register_non_parquet_file() { "1.json' does not match the expected exten

Re: [PR] bug: Fix NULL handling in array_slice [datafusion]

2025-01-29 Thread via GitHub
jkosh44 commented on PR #14289: URL: https://github.com/apache/datafusion/pull/14289#issuecomment-2621158643 > ```rust > /// Any Null input causes the function to return Null. > Propogate, > ``` I've updated the PR to use this. I just realized though that window and aggre

Re: [I] Add `DataFrame::map` utility .map function for DataFrame for modifying internal LogicalPlan [datafusion]

2025-01-29 Thread via GitHub
phisn commented on issue #14317: URL: https://github.com/apache/datafusion/issues/14317#issuecomment-2621740237 @timsaucer My specific use case is what @alamb described. The problem with the transform approach is that I am forced to `.logical_plan().clone` as well as create a new `DataFrame

Re: [I] Ensure `to_timestamp` behaves consistently with PostgreSQL [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on issue #13351: URL: https://github.com/apache/datafusion/issues/13351#issuecomment-2621744869 > That is a very good question. There must have been a reason I didn't do that when I coded this up but right now I can't recall why it would have been. Let me think about t

[I] Enable TableScan to return multiple arbitrary table references [datafusion]

2025-01-29 Thread via GitHub
phisn opened a new issue, #14358: URL: https://github.com/apache/datafusion/issues/14358 ### Is your feature request related to a problem or challenge? `TableProvider` has a type which indicates which type of table it contains. This can be a view. Therefore it makes sense to encode ma

Re: [PR] Fix UNION field nullability tracking [datafusion]

2025-01-29 Thread via GitHub
findepi commented on PR #14356: URL: https://github.com/apache/datafusion/pull/14356#issuecomment-2621752435 cc @felipecrv, @Omega359

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1933951486 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -3490,3 +3437,6 @@ trait CometExpressionSerde { inputs: Seq[Attribute],

Re: [PR] Restore ability to run single SLT file [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on PR #14355: URL: https://github.com/apache/datafusion/pull/14355#issuecomment-2621782574 I made a small PR to your branch to show the # of files tested, verifying that `cargo test --test sqllogictests -- union.slt` runs 2 files whereas `cargo test -
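
One plausible reading of why a single filter argument runs two files is that the filter is matched as a substring of each test file path. The Rust sketch below illustrates that behavior under this assumption; `select_files` and the file names are hypothetical and are not taken from `sqllogictests.rs`.

```rust
/// Keep every path that contains at least one filter as a substring.
/// An empty filter list selects everything.
/// Hypothetical helper for illustration; not the actual sqllogictests code.
fn select_files<'a>(paths: &[&'a str], filters: &[&str]) -> Vec<&'a str> {
    paths
        .iter()
        .copied()
        .filter(|p| filters.is_empty() || filters.iter().any(|f| p.contains(*f)))
        .collect()
}

fn main() {
    // Hypothetical file names: two of them contain "union.slt".
    let paths = ["union.slt", "other_union.slt", "joins.slt"];
    let selected = select_files(&paths, &["union.slt"]);
    // Both "union.slt" and "other_union.slt" are selected.
    println!("{} files selected: {:?}", selected.len(), selected);
}
```

Under substring matching, running exactly one file would require either an exact-match comparison or a filter string that no other path contains.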

Re: [PR] feat: metadata columns [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2621231579 > > Then we cannot make sure metadata columns always have same column index in different tables > > Not quite understand the reason, with modified metadata, meta column coul

[PR] Add RETURNS TABLE() support for CREATE FUNCTION in Postgresql [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
remysaissy opened a new pull request, #1687: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1687 Currently CREATE FUNCTION doesn't support the RETURNS TABLE datatype (https://www.postgresql.org/docs/15/sql-createfunction.html). This PR adds it for the PostgresSQL and Gene

Re: [PR] BigQuery: Fix column identifier reserved keywords list [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
alamb merged PR #1678: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1678

Re: [I] support: Date +/plus Int or date_add function [datafusion]

2025-01-29 Thread via GitHub
edmondop commented on issue #6876: URL: https://github.com/apache/datafusion/issues/6876#issuecomment-2621668103 @DanCodedThis and @alamb I looked at [snowflake documentation](https://docs.snowflake.com/en/sql-reference/functions/dateadd), [Spark documentation](https://spark.apache.org/doc

Re: [PR] Make TypedString contain Value instead of String to support and preserve other quote styles [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
graup commented on code in PR #1679: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1679#discussion_r1932661673 ## tests/sqlparser_bigquery.rs: ## @@ -2214,6 +2214,30 @@ fn test_select_as_value() { assert_eq!(Some(ValueTableMode::AsValue), select.value_table_m

Re: [PR] fix: LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
xudong963 commented on code in PR #14245: URL: https://github.com/apache/datafusion/pull/14245#discussion_r1933897333 ## datafusion/physical-optimizer/src/limit_pushdown.rs: ## @@ -248,7 +247,15 @@ pub fn pushdown_limit_helper( } } else { // Ad

Re: [PR] fix: pass scale to DF round in spark_round [datafusion-comet]

2025-01-29 Thread via GitHub
cht42 commented on code in PR #1341: URL: https://github.com/apache/datafusion-comet/pull/1341#discussion_r1933876335 ## native/spark-expr/src/math_funcs/round.rs: ## @@ -85,9 +85,10 @@ pub fn spark_round( let (precision, scale) = get_precision_scale(data_type);

Re: [PR] chore: Remove redundant processing from exprToProtoInternal [datafusion-comet]

2025-01-29 Thread via GitHub
codecov-commenter commented on PR #1351: URL: https://github.com/apache/datafusion-comet/pull/1351#issuecomment-2622069957 ## [Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1351?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_ca

Re: [PR] moving memory.rs out of datafusion/core [datafusion]

2025-01-29 Thread via GitHub
comphead commented on PR #14332: URL: https://github.com/apache/datafusion/pull/14332#issuecomment-2622072090 > if we go with that we should change name of every similar file in a similar manner (atleast `async.rs`). That is true, to be consistent it needs to be fixed for all files wh

Re: [PR] fix: fetch is missed during EnforceDistribution [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14207: URL: https://github.com/apache/datafusion/pull/14207#discussion_r1934089417 ## datafusion/physical-optimizer/src/enforce_distribution.rs: ## @@ -932,12 +932,16 @@ fn add_hash_on_top( /// # Arguments /// /// * `input`: Current node. +/// *

Re: [PR] moving memory.rs out of datafusion/core [datafusion]

2025-01-29 Thread via GitHub
comphead merged PR #14332: URL: https://github.com/apache/datafusion/pull/14332

Re: [PR] Restore ability to run single SLT file [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14355: URL: https://github.com/apache/datafusion/pull/14355

Re: [PR] Restore ability to run single SLT file [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14355: URL: https://github.com/apache/datafusion/pull/14355#issuecomment-2622078034 Thanks again @findepi

Re: [PR] refactor: switch `BooleanBufferBuilder` to `NullBufferBuilder` in binary_map [datafusion]

2025-01-29 Thread via GitHub
comphead merged PR #14341: URL: https://github.com/apache/datafusion/pull/14341

Re: [PR] chore(deps): bump home from 0.5.9 to 0.5.11 in /datafusion-cli [datafusion]

2025-01-29 Thread via GitHub
alamb merged PR #14257: URL: https://github.com/apache/datafusion/pull/14257

Re: [PR] moving memory.rs out of datafusion/core [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14332: URL: https://github.com/apache/datafusion/pull/14332#issuecomment-2622083005 😍

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
comphead commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934180368 ## common/src/main/scala/org/apache/comet/CometConf.scala: ## @@ -605,6 +605,15 @@ object CometConf extends ShimCometConf { .booleanConf .create

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
comphead commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934182431 ## docs/source/user-guide/configs.md: ## @@ -64,6 +64,7 @@ Comet provides the following configuration settings. | spark.comet.explain.native.enabled | When t

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
comphead commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934181473 ## common/src/main/scala/org/apache/comet/CometConf.scala: ## @@ -605,6 +605,15 @@ object CometConf extends ShimCometConf { .booleanConf .create

Re: [PR] moving memory.rs out of datafusion/core [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14332: URL: https://github.com/apache/datafusion/pull/14332#discussion_r1934183938 ## datafusion/catalog/src/lib.rs: ## @@ -15,6 +15,16 @@ // specific language governing permissions and limitations // under the License. +//! Interfaces and defa

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
comphead commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934186350 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -929,6 +929,19 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde wi

Re: [PR] chore: Move all array_* serde to new framework, use correct INCOMPAT config [datafusion-comet]

2025-01-29 Thread via GitHub
comphead commented on code in PR #1349: URL: https://github.com/apache/datafusion-comet/pull/1349#discussion_r1934192543 ## spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: ## @@ -2371,83 +2384,17 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde

Re: [PR] feat: metadata columns [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2620931040 Yep from an initial impression I agree with you @jayzhan211 that does seem like it should work. Then you just need to hook into wherever select / projections are evaluated and make

Re: [PR] Fix incorrect searched CASE optimization [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #14349: URL: https://github.com/apache/datafusion/pull/14349#discussion_r1933432132 ## datafusion/sqllogictest/test_files/case.slt: ## @@ -289,12 +289,22 @@ query B select case when a=1 then false end from foo; false -false -false -false -

Re: [PR] Improve speed of `median` by implementing special `GroupsAccumulator` [datafusion]

2025-01-29 Thread via GitHub
Rachelint commented on code in PR #13681: URL: https://github.com/apache/datafusion/pull/13681#discussion_r1933435620 ## datafusion/functions-aggregate/src/median.rs: ## @@ -230,6 +276,212 @@ impl Accumulator for MedianAccumulator { } } +/// The median groups accumulato

Re: [PR] Extend lambda support for ClickHouse, DuckDB and Generic dialects [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
gstvg commented on code in PR #1686: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1686#discussion_r1933397382 ## src/dialect/mod.rs: ## @@ -340,12 +340,21 @@ pub trait Dialect: Debug + Any { /// Returns true if the dialect supports lambda functions, for exam

[PR] chore(deps): bump tempfile from 3.15.0 to 3.16.0 in /datafusion-cli [datafusion]

2025-01-29 Thread via GitHub
dependabot[bot] opened a new pull request, #14350: URL: https://github.com/apache/datafusion/pull/14350 Bumps [tempfile](https://github.com/Stebalien/tempfile) from 3.15.0 to 3.16.0. Changelog Sourced from https://github.com/Stebalien/tempfile/blob/master/CHANGELOG.md tempfile's

[PR] chore(deps): bump serde_json from 1.0.137 to 1.0.138 in /datafusion-cli [datafusion]

2025-01-29 Thread via GitHub
dependabot[bot] opened a new pull request, #14351: URL: https://github.com/apache/datafusion/pull/14351 Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.137 to 1.0.138. Release notes Sourced from https://github.com/serde-rs/json/releases serde_json's releases.

[I] Incorrect result for IS NOT NULL predicate over UNION ALL query [datafusion]

2025-01-29 Thread via GitHub
findepi opened a new issue, #14352: URL: https://github.com/apache/datafusion/issues/14352 ### Describe the bug Incorrect result for IS NOT NULL predicate over UNION ALL query ### To Reproduce ```sql SELECT a, a IS NOT NULL FROM ( SELECT 'foo' A

[PR] chore(deps): update getrandom requirement from 0.2.8 to 0.3.1 [datafusion]

2025-01-29 Thread via GitHub
dependabot[bot] opened a new pull request, #14353: URL: https://github.com/apache/datafusion/pull/14353 Updates the requirements on [getrandom](https://github.com/rust-random/getrandom) to permit the latest version. Changelog Sourced from https://github.com/rust-random/getrandom/b

Re: [PR] chore(deps): update getrandom requirement from 0.2.8 to 0.3.0 [datafusion]

2025-01-29 Thread via GitHub
dependabot[bot] closed pull request #14315: chore(deps): update getrandom requirement from 0.2.8 to 0.3.0 URL: https://github.com/apache/datafusion/pull/14315

Re: [PR] chore(deps): update getrandom requirement from 0.2.8 to 0.3.0 [datafusion]

2025-01-29 Thread via GitHub
dependabot[bot] commented on PR #14315: URL: https://github.com/apache/datafusion/pull/14315#issuecomment-2621010404 Superseded by #14353.

Re: [PR] fix: FULL OUTER JOIN and LIMIT produces wrong results [datafusion]

2025-01-29 Thread via GitHub
zhuqi-lucas commented on PR #14338: URL: https://github.com/apache/datafusion/pull/14338#issuecomment-2621010452 Thank you @alamb and @xudong963!

[PR] chore(deps): bump korandoru/hawkeye from 5 to 6 [datafusion]

2025-01-29 Thread via GitHub
dependabot[bot] opened a new pull request, #14354: URL: https://github.com/apache/datafusion/pull/14354 Bumps [korandoru/hawkeye](https://github.com/korandoru/hawkeye) from 5 to 6. Release notes Sourced from https://github.com/korandoru/hawkeye/releases korandoru/hawkeye's releas

Re: [I] Add support for lambda/higher order functions [datafusion]

2025-01-29 Thread via GitHub
gstvg commented on issue #14205: URL: https://github.com/apache/datafusion/issues/14205#issuecomment-2621052052 Hi @rkrishn7, have you already started any work? Since creating this issue, I’ve been working on a PoC to refine the proposal, and I managed to get it working. I’d like to c

Re: [I] Add support for lambda/higher order functions [datafusion]

2025-01-29 Thread via GitHub
gstvg commented on issue #14205: URL: https://github.com/apache/datafusion/issues/14205#issuecomment-2621070075 Lambda syntax only works with databricks dialect, proposed fix here https://github.com/apache/datafusion-sqlparser-rs/pull/1686

Re: [PR] fix: LimitPushdown rule uncorrect remove some GlobalLimitExec [datafusion]

2025-01-29 Thread via GitHub
zhuqi-lucas commented on code in PR #14245: URL: https://github.com/apache/datafusion/pull/14245#discussion_r1933513054 ## datafusion/physical-optimizer/src/limit_pushdown.rs: ## @@ -247,7 +246,15 @@ pub fn pushdown_limit_helper( } } else { //

Re: [PR] Script and documentation for regenerating sqlite test files [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on code in PR #14290: URL: https://github.com/apache/datafusion/pull/14290#discussion_r1934713735 ## datafusion/sqllogictest/regenerate_sqlite_files.sh: ## @@ -0,0 +1,179 @@ +#!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one +# or mo

Re: [I] Type Coercion fails for List with inner type struct which has large/view types [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14154: URL: https://github.com/apache/datafusion/issues/14154#issuecomment-2622957118 Possibly related: - https://github.com/apache/datafusion/pull/12490 (released in DataFusion 44.0.0) - https://github.com/apache/datafusion/pull/13452 (also released in Data

Re: [I] Type Coercion fails for List with inner type struct which has large/view types [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14154: URL: https://github.com/apache/datafusion/issues/14154#issuecomment-2622959604 So the next step for this PR is to find a DataFusion only reproducer that works in DF 43 but not in DF 44. I will try to do so tomorrow

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2622977789 as described in document, https://spark.apache.org/docs/3.5.1/api/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.html If a table column and a metadata c

Re: [PR] chore: Prepare for DataFusion 45 (bump to DataFusion rev 5592834 + Arrow 54.0.0) [datafusion-comet]

2025-01-29 Thread via GitHub
andygrove commented on code in PR #1332: URL: https://github.com/apache/datafusion-comet/pull/1332#discussion_r1934746178 ## native/core/src/execution/operators/filter.rs: ## @@ -62,6 +65,8 @@ pub struct FilterExec { default_selectivity: u8, /// Properties equivalence

Re: [PR] Example for using a separate threadpool for CPU bound work (try 2) [datafusion]

2025-01-29 Thread via GitHub
rohitrastogi commented on code in PR #14286: URL: https://github.com/apache/datafusion/pull/14286#discussion_r1934929386 ## datafusion-examples/examples/thread_pools_lib/dedicated_executor.rs: ## @@ -0,0 +1,1778 @@ +// Licensed to the Apache Software Foundation (ASF) under one +

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
alamb commented on code in PR #14364: URL: https://github.com/apache/datafusion/pull/14364#discussion_r1934520683 ## datafusion/substrait/Cargo.toml: ## @@ -46,6 +46,7 @@ url = { workspace = true } [dev-dependencies] datafusion = { workspace = true, features = ["nested_expre

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
alamb commented on PR #14364: URL: https://github.com/apache/datafusion/pull/14364#issuecomment-2622752446 I am checking this one out

Re: [I] Build time regression [datafusion]

2025-01-29 Thread via GitHub
alamb commented on issue #14256: URL: https://github.com/apache/datafusion/issues/14256#issuecomment-2622826545 > > After removing the WildcardOptions (by replacing it with an empty structure) I can see the build time drops. Removing the rule itself and the change in core doesn't help. It l

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on code in PR #14364: URL: https://github.com/apache/datafusion/pull/14364#discussion_r1934571539 ## datafusion/substrait/Cargo.toml: ## @@ -46,6 +46,7 @@ url = { workspace = true } [dev-dependencies] datafusion = { workspace = true, features = ["nested

Re: [PR] move information_schema to datafusion-catalog [datafusion]

2025-01-29 Thread via GitHub
logan-keede commented on PR #14364: URL: https://github.com/apache/datafusion/pull/14364#issuecomment-2622850087 > If you are feeling like some more refactoring projects, any chance you are interested in working to split out the data sources (aka make `datafusion-datasource-parquet`, `dataf

Re: [I] Simple Functions [datafusion]

2025-01-29 Thread via GitHub
Omega359 commented on issue #12635: URL: https://github.com/apache/datafusion/issues/12635#issuecomment-2622912627 https://github.com/apache/datafusion/blob/main/datafusion/functions-nested/src/string.rs is a good example of what can result with supporting many types and args

Re: [PR] Reduce size of `Expr` struct [datafusion]

2025-01-29 Thread via GitHub
findepi commented on code in PR #14366: URL: https://github.com/apache/datafusion/pull/14366#discussion_r1934669162 ## datafusion/expr/src/expr.rs: ## @@ -3067,4 +3078,19 @@ mod test { rename: opt_rename, } } + +#[test] +fn test_size_of_expr()

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623096983 something like this. ```rust #[tokio::test] async fn test_name_conflict() { let batch = record_batch!( ("_rowid", UInt32, [0, 1, 2]), ("_ro

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
adriangb commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623148014 I will still give a shot at adding that feature tomorrow. But I'm not sold on the behavior being ideal even if that's what Spark does. Besides if it's an error now we can always mak

Re: [PR] feat: add expression array_size [datafusion-comet]

2025-01-29 Thread via GitHub
parthchandra commented on code in PR #1122: URL: https://github.com/apache/datafusion-comet/pull/1122#discussion_r1934793909 ## native/spark-expr/src/list.rs: ## @@ -708,6 +708,92 @@ impl PartialEq for ArrayInsert { } } +#[derive(Debug, Hash)] +pub struct ArraySize { +

Re: [PR] Support marking columns as system columns via Field's metadata [datafusion]

2025-01-29 Thread via GitHub
chenkovsky commented on PR #14362: URL: https://github.com/apache/datafusion/pull/14362#issuecomment-2623174593 > Is this that important to support? The example seems a bit contrived, I think it'd be more reasonable if it occurred naturally as part of a join or something where a user could

Re: [I] Implement xxhash algorithms as part of the expression API [datafusion]

2025-01-29 Thread via GitHub
Spaarsh commented on issue #14044: URL: https://github.com/apache/datafusion/issues/14044#issuecomment-2623464462 Thanks @HectorPascual! I'm opening a PR from here on for transparency's sake!

[PR] 14044/enhancement/add xxhash algorithms in expression api [datafusion]

2025-01-29 Thread via GitHub
Spaarsh opened a new pull request, #14367: URL: https://github.com/apache/datafusion/pull/14367 ## Which issue does this PR close? Closes #14044. ## Rationale for this change Lack of xxhash (a quick, non-cryptographic hashing technique) functions. ## What c

Re: [PR] Allow plain JOIN without turning it into INNER [datafusion-sqlparser-rs]

2025-01-29 Thread via GitHub
iffyio merged PR #1692: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1692
