alamb commented on PR #15503:
URL: https://github.com/apache/datafusion/pull/15503#issuecomment-2787539880
I think the current API in this PR (returning `PartitionedStatistics`) is
nice from the point of view that it provides the finest granularity access to
statistics (provides per partition information that can be summarized when
desired).
I do agree it is unfortunate that `ExecutionPlan::statistics` and
`ExecutionPlan::partitioned_statistics` may end up being redundant.
However, I think changing the signature of `ExecutionPlan::statistics` will
just be that much more disruptive.
So what I personally suggest is add `ExecutionPlan::partitioned_statistics`
as in this PR, but work to reduce the duplication. I think we could add a
default implementation of `statistics` that just called through to
`partitioned-statistics` which would avoid the duplication. Something like
```rust
pub trait ExecutionPlan {
...
fn statistics(&self) -> Result<Statistics> {
self.partitioned_statistics().combine()
}
...
````
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]