alamb commented on PR #13736: URL: https://github.com/apache/datafusion/pull/13736#issuecomment-2558186777
I have been thinking a lot about this PR, and I don't want to let it die because we are stuck trying to figure out a broader statistics question. I would like to find an incremental way forward.

Here is my proposal: we target this feature (support sum statistics) for inclusion in DataFusion 45 (aka we plan to merge this, or part of it, after DataFusion 44 is out). That will give us time to rework / finagle the APIs without having to make breaking changes in back-to-back releases (hopefully).

> Do you consider this to be blocking for this PR? Or is expanding the size of ColumnStatistics acceptable in the short-term?

I recommend:

1. Add the new sum statistics in one PR
2. Look into optimizing the size (maybe wrapping the Statistics with `Arc` or something) as a second PR(s)
3. Figure out how to integrate `PhysicalExpr::column_statistics` as a third PR (I think it would make sense after we had the ColumnStatistics in Arc)
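To make the `Arc` idea in the second recommendation concrete, here is a minimal sketch of why it could offset the extra field. The types `ColumnStats` and `PlanStats` and the helper `share` are hypothetical, simplified stand-ins invented for illustration; they are not DataFusion's actual `ColumnStatistics` / `Statistics` definitions, and the real change may look quite different.

```rust
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical, simplified stand-in for per-column statistics.
// NOT DataFusion's real `ColumnStatistics`.
#[derive(Debug, Clone, PartialEq)]
pub struct ColumnStats {
    pub null_count: Option<usize>,
    pub min_value: Option<i64>,
    pub max_value: Option<i64>,
    // The newly added field: every by-value copy of the struct grows with it.
    pub sum_value: Option<i64>,
}

// Hypothetical stand-in for plan-level statistics.
#[derive(Debug, Clone, PartialEq)]
pub struct PlanStats {
    pub num_rows: Option<usize>,
    pub columns: Vec<ColumnStats>,
}

// Wrap the statistics once and hand out two handles to the same allocation.
// `Arc::clone` only bumps a reference count; the column data is not copied.
pub fn share(stats: PlanStats) -> (Arc<PlanStats>, Arc<PlanStats>) {
    let first = Arc::new(stats);
    let second = Arc::clone(&first);
    (first, second)
}

// The size of a handle is one pointer, no matter how many fields
// the statistics structs carry.
pub fn handle_size() -> usize {
    size_of::<Arc<PlanStats>>()
}
```

Under these assumptions, adding `sum_value` makes each `ColumnStats` larger, but code that passes around an `Arc<PlanStats>` keeps moving a single pointer, and both handles returned by `share` refer to the same allocation.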