rroelke opened a new issue, #13439: URL: https://github.com/apache/datafusion/issues/13439
The extent of the documentation for [`TableProvider::statistics`](https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.statistics) in version 43.0.0 is:

```
Get statistics for this table, if available
```

This offers no explanation as to how the statistics will or will not be used.

A user with experience in analytical database engines who is writing a custom `TableProvider` implementation may suspect that `TableProvider::statistics` is used by the DataFusion query optimizer to determine join orders, perhaps among other things. However, this conclusion is apparently incorrect, which I deduce from the following pieces of evidence:

1) I am a user fitting that description and found that my custom `TableProvider::statistics` was not called in the presence of a join query.
2) `cargo check --workspace --tests` runs with no errors if I remove the `fn statistics` declaration from the `trait TableProvider` definition.
3) Having found the source code for the rule that changes join orders, it is clear that it calls `ExecutionPlan::statistics` instead.

### Expectation

The documentation should set appropriate expectations for what `TableProvider::statistics` is used for, so that developers can make informed choices about whether or not to implement it.

### Additional context

The apparent answer to what `TableProvider::statistics` is used for is "nothing", based on the `cargo check --workspace --tests` observation above, but removing the trait method is a breaking change. Based on the Slack discussion prior to filing this issue, at least one user depends on `TableProvider::statistics` for their custom optimizer rules, and removing it would require them to find a workaround.

Short of deprecating or removing the trait method, I would personally be satisfied just with updates to the method documentation.
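To illustrate the behavior described in the evidence list, here is a minimal sketch using stand-in traits. These are not the real DataFusion definitions: the signatures are simplified (no `async`, no `Result`, no `Arc`), and `MyTable`, `ScanExec`, `should_swap`, and `plan_join` are hypothetical names invented for this example. The toy rule reads statistics only from the plan nodes, so the provider-level method is never called unless something else invokes it explicitly.

```rust
use std::cell::Cell;

// Stand-in types for illustration only; NOT the real DataFusion API.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Statistics {
    num_rows: Option<usize>,
}

// Simplified stand-in for a physical plan node.
trait ExecutionPlan {
    fn statistics(&self) -> Statistics;
}

// Simplified stand-in for a table provider.
trait TableProvider {
    /// Get statistics for this table, if available
    fn statistics(&self) -> Option<Statistics> {
        None
    }
    fn scan(&self) -> Box<dyn ExecutionPlan>;
}

// Hypothetical scan node that carries its own statistics.
struct ScanExec {
    stats: Statistics,
}

impl ExecutionPlan for ScanExec {
    fn statistics(&self) -> Statistics {
        self.stats
    }
}

// Hypothetical custom table that counts calls to its provider-level statistics.
struct MyTable {
    rows: usize,
    provider_stats_calls: Cell<usize>,
}

impl MyTable {
    fn new(rows: usize) -> Self {
        MyTable { rows, provider_stats_calls: Cell::new(0) }
    }
}

impl TableProvider for MyTable {
    fn statistics(&self) -> Option<Statistics> {
        self.provider_stats_calls.set(self.provider_stats_calls.get() + 1);
        Some(Statistics { num_rows: Some(self.rows) })
    }

    // The scan node must be given statistics directly; the toy rule below
    // never looks at the provider.
    fn scan(&self) -> Box<dyn ExecutionPlan> {
        Box::new(ScanExec { stats: Statistics { num_rows: Some(self.rows) } })
    }
}

// Toy join-order rule: swap the inputs when the left side has more rows.
// It consults only ExecutionPlan::statistics.
fn should_swap(left: &dyn ExecutionPlan, right: &dyn ExecutionPlan) -> bool {
    match (left.statistics().num_rows, right.statistics().num_rows) {
        (Some(l), Some(r)) => l > r,
        _ => false,
    }
}

// Plan a join of two tables and report whether the toy rule swaps them.
fn plan_join(left: &MyTable, right: &MyTable) -> bool {
    let left_plan = left.scan();
    let right_plan = right.scan();
    should_swap(left_plan.as_ref(), right_plan.as_ref())
}
```

In this model, calling `plan_join` on two `MyTable` values leaves `provider_stats_calls` at zero for both, which mirrors the observation in item 1 above. A custom optimizer rule that wants the provider-level statistics would have to call `TableProvider::statistics` itself.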
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
