timsaucer commented on code in PR #1016: URL: https://github.com/apache/datafusion-python/pull/1016#discussion_r1957178603
########## src/dataframe.rs: ########## @@ -50,6 +52,22 @@ use crate::{ expr::{sort_expr::PySortExpr, PyExpr}, }; +#[pyclass(name = "TableProvider", module = "datafusion")] +pub struct PyTableProvider { + provider: Arc<dyn TableProvider>, +} + +impl PyTableProvider { + pub fn new(provider: Arc<dyn TableProvider>) -> Self { + Self { provider } + } + + pub fn as_table(&self) -> PyTable { + let table_provider: Arc<dyn TableProvider> = self.provider.clone(); + PyTable::new(table_provider) + } +} + Review Comment: In general I think this is a good idea, but I'm worried about causing confusion with a table provider created from a view and a table provider that is passed from an external source using pycapsule. I can imagine a user would think that a table provider object from one place can be used with another. That is, if I create a table provider with into_view I should be able to register it with the session context. Now, I don't think that operation is strictly necssary but I do expect it would cause some confusion. What I think we want to do is to have a single common PyTableProvider that can be created either via a pycapsule or into_view. ########## src/dataframe.rs: ########## @@ -156,6 +174,20 @@ impl PyDataFrame { PyArrowType(self.df.schema().into()) } + /// Convert this DataFrame into a Table that can be used in register_table + fn _into_view(&self) -> PyDataFusionResult<PyTable> { Review Comment: I recommend disabling that specific clippy warning. ########## README.md: ########## @@ -81,6 +81,49 @@ This produces the following chart:  +## Registering a DataFrame as a View + +You can use the `into_view` method to convert a DataFrame into a view and register it with the context. + +```python +from datafusion import SessionContext, col, literal + Review Comment: I think this is very good, but would be more helpful if moved into the appropriate docs section so it goes into the online documentation rather than the readme. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org