Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-28 Thread via GitHub
alamb commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2620036711 And I broke the build 🤦 . Fix PR: - https://github.com/apache/datafusion/pull/14345 -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-28 Thread via GitHub
alamb commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2619977704 WFT let's do it and keep things moving

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-28 Thread via GitHub
alamb merged PR #14074: URL: https://github.com/apache/datafusion/pull/14074

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-28 Thread via GitHub
alamb commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2619977143 > Any other blockers @alamb ? Thanks for hustling this through I am somewhat overwhelmed with - https://github.com/apache/datafusion/issues/14008 (and also - https://gi

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-28 Thread via GitHub
ozankabak commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2619923329 LGTM

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-28 Thread via GitHub
gatesn commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2619891822 Any other blockers @alamb ? Thanks for hustling this through

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-23 Thread via GitHub
alamb commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2611215362 I merged this branch up from main and triggered the CI again. If there are no additional concerns I hope to merge this in a day or two

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-23 Thread via GitHub
berkaysynnada commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2609448004 > Statistics can be helpful for optimizer rules, but they also allow short-circuiting computations. For example, min/max can be used to avoid evaluating a filter over a record

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-23 Thread via GitHub
gatesn commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2609426600 Statistics can be helpful for optimizer rules, but they also allow short-circuiting computations. For example, min/max can be used to avoid evaluating a filter over a record batch and
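
The min/max short-circuit mentioned in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ColumnStats`, `FilterOutcome`, and `prune_greater_than` are hypothetical names rather than DataFusion APIs, the statistics are assumed to be exact, and the column is assumed to contain no nulls.

```rust
/// Hypothetical per-batch statistics for one integer column.
/// Not the DataFusion `ColumnStatistics` type; assumed exact and null-free.
#[derive(Debug, Clone, Copy)]
pub struct ColumnStats {
    pub min: i64,
    pub max: i64,
}

/// What the statistics alone tell us about the predicate `col > threshold`.
#[derive(Debug, PartialEq, Eq)]
pub enum FilterOutcome {
    /// Every row passes, so the filter need not be evaluated.
    AllRows,
    /// No row passes, so the batch can be skipped entirely.
    NoRows,
    /// The statistics are inconclusive; evaluate the filter row by row.
    MustEvaluate,
}

/// Decide the outcome of `col > threshold` from min/max only.
pub fn prune_greater_than(stats: ColumnStats, threshold: i64) -> FilterOutcome {
    if stats.min > threshold {
        FilterOutcome::AllRows
    } else if stats.max <= threshold {
        FilterOutcome::NoRows
    } else {
        FilterOutcome::MustEvaluate
    }
}
```

With `min = 10` and `max = 20`, a threshold of 5 yields `AllRows`, a threshold of 20 yields `NoRows`, and a threshold of 15 yields `MustEvaluate`.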

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-23 Thread via GitHub
berkaysynnada commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2609417451 > I can't think of any other statistical quantities that would immediately help operators, so from our perspective it's only "sum" (we may also use sum to mean true-count for b

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-23 Thread via GitHub
gatesn commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2609166991 I can't think of any other statistical quantities that would immediately help operators, so from our perspective it's only "sum" (we may also use sum to mean true-count for booleans).
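
The idea of reading "sum" as a true-count for booleans can be illustrated with a small sketch. The function names are hypothetical and not part of DataFusion; it assumes `true` counts as 1, `false` as 0, and that the column has no nulls.

```rust
/// Sum of a boolean column when `true` is counted as 1 and `false` as 0,
/// which is the number of `true` values.
pub fn true_count(values: &[bool]) -> u64 {
    values.iter().filter(|v| **v).count() as u64
}

/// If the sum is zero, a mask built from this column selects no rows.
pub fn mask_is_all_false(sum: u64) -> bool {
    sum == 0
}

/// If the sum equals the row count, the mask selects every row.
pub fn mask_is_all_true(sum: u64, row_count: u64) -> bool {
    sum == row_count
}
```

An operator that already knows the sum could use either check to skip work without touching the values themselves.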

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-23 Thread via GitHub
berkaysynnada commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2609145891 > @berkaysynnada can we merge this PR in now? Or shall we wait for the statistics revamp that is underway? No need to wait for the underway PR, as it does not depend on which sta

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-22 Thread via GitHub
alamb commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2608482804 @berkaysynnada can we merge this PR in now? Or shall we wait for the statistics revamp that is underway?

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-15 Thread via GitHub
berkaysynnada commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2594728046 > Looks like I got hit by some new ColumnStatistics tests on main. Should be fixed now 🤞 > > @berkaysynnada can you expand on the rationale for the V2 stats? I understan

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-15 Thread via GitHub
gatesn commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2592991088 Looks like I got hit by some new ColumnStatistics tests on main. Should be fixed now 🤞 @berkaysynnada can you expand on the rationale for the V2 stats? I understand that it's

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-13 Thread via GitHub
alamb commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2588553063 > > > We've started to refactor. The design is complete, and the implementation is in progress. > > > > > > Thanks! Is there anywhere I can follow along @berkaysynnada (I am

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-13 Thread via GitHub
berkaysynnada commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2588148550 > > We've started to refactor. The design is complete, and the implementation is in progress. > > Thanks! Is there anywhere I can follow along @berkaysynnada (I am parti

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-13 Thread via GitHub
alamb commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2588096257 > We've started to refactor. The design is complete, and the implementation is in progress. Thanks! Is there anywhere I can follow along @berkaysynnada (I am particularly inter

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-12 Thread via GitHub
berkaysynnada commented on PR #14074: URL: https://github.com/apache/datafusion/pull/14074#issuecomment-2585899506 > FYI @suremarc @berkaysynnada / @ozankabak as this changes statistics and I think you are already working on things related to that: We've started to refactor. The desig

Re: [PR] Add `ColumnStatistics::Sum` [datafusion]

2025-01-12 Thread via GitHub
gatesn commented on code in PR #14074: URL: https://github.com/apache/datafusion/pull/14074#discussion_r1912472076 ## datafusion/common/src/stats.rs: ## @@ -170,24 +170,63 @@ impl Precision { pub fn add(&self, other: &Precision) -> Precision { match (self, other)
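
The diff excerpt above is cut off inside `Precision::add`. As a rough sketch of how adding two precision-tagged statistics might behave, the following uses a simplified, non-generic `Precision` over `u64`. This is an assumption made for illustration and is not the code from `datafusion/common/src/stats.rs`: the sum is exact only when both inputs are exact, inexact when both values are known but at least one is an estimate, and absent otherwise.

```rust
/// Simplified stand-in for a precision-tagged statistic (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Precision {
    /// The value is known exactly.
    Exact(u64),
    /// The value is an estimate.
    Inexact(u64),
    /// Nothing is known about the value.
    Absent,
}

impl Precision {
    /// Add two statistics, keeping the weaker of the two precision levels.
    pub fn add(&self, other: &Precision) -> Precision {
        match (self, other) {
            // Exact + exact stays exact.
            (Precision::Exact(a), Precision::Exact(b)) => Precision::Exact(a + b),
            // Any estimate on either side makes the result an estimate.
            (Precision::Inexact(a), Precision::Exact(b))
            | (Precision::Exact(a), Precision::Inexact(b))
            | (Precision::Inexact(a), Precision::Inexact(b)) => Precision::Inexact(a + b),
            // If either side is unknown, so is the sum.
            (_, _) => Precision::Absent,
        }
    }
}
```

For example, `Precision::Exact(3).add(&Precision::Inexact(4))` gives `Precision::Inexact(7)`, while adding anything to `Precision::Absent` gives `Precision::Absent`.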
