timsaucer commented on code in PR #1016:
URL: 
https://github.com/apache/datafusion-python/pull/1016#discussion_r1957178603


##########
src/dataframe.rs:
##########
@@ -50,6 +52,22 @@ use crate::{
     expr::{sort_expr::PySortExpr, PyExpr},
 };
 
+#[pyclass(name = "TableProvider", module = "datafusion")]
+pub struct PyTableProvider {
+    provider: Arc<dyn TableProvider>,
+}
+
+impl PyTableProvider {
+    pub fn new(provider: Arc<dyn TableProvider>) -> Self {
+        Self { provider }
+    }
+
+    pub fn as_table(&self) -> PyTable {
+        let table_provider: Arc<dyn TableProvider> = self.provider.clone();
+        PyTable::new(table_provider)
+    }
+}
+

Review Comment:
   In general I think this is a good idea, but I'm worried about causing 
confusion with a table provider created from a view and a table provider that 
is passed from an external source using pycapsule. I can imagine a user would 
think that a table provider object from one place can be used with another. 
That is, if I create a table provider with into_view I should be able to 
register it with the session context. Now, I don't think that operation is 
strictly necssary but I do expect it would cause some confusion.
   
   What I think we want to do is to have a single common PyTableProvider that 
can be created either via a pycapsule or into_view.



##########
src/dataframe.rs:
##########
@@ -156,6 +174,20 @@ impl PyDataFrame {
         PyArrowType(self.df.schema().into())
     }
 
+    /// Convert this DataFrame into a Table that can be used in register_table
+    fn _into_view(&self) -> PyDataFusionResult<PyTable> {

Review Comment:
   I recommend disabling that specific clippy warning.



##########
README.md:
##########
@@ -81,6 +81,49 @@ This produces the following chart:
 
 ![Chart](examples/chart.png)
 
+## Registering a DataFrame as a View
+
+You can use the `into_view` method to convert a DataFrame into a view and 
register it with the context.
+
+```python
+from datafusion import SessionContext, col, literal
+

Review Comment:
   I think this is very good, but would be more helpful if moved into the 
appropriate docs section so it goes into the online documentation rather than 
the readme.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to