paleolimbot commented on issue #159:
URL: https://github.com/apache/sedona-db/issues/159#issuecomment-3364129733

   We're using the FFI in a few ways! datafusion-python table providers can be 
used in `sd.create_data_frame()`:
   
   
https://github.com/apache/sedona-db/blob/a15844b4ff9c7e9416b8f1c1c07ad81d908a89cd/python/sedonadb/src/import_from.rs#L56-L64
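
   As a rough illustration of the idea (not the actual code behind the link 
above), table providers from datafusion-python are recognized by a 
`__datafusion_table_provider__` method that returns a PyCapsule wrapping an FFI 
table provider. The sketch below assumes a simple attribute check; 
`pick_import_path()` and `StubProvider` are hypothetical names made up for this 
example, and the `__arrow_c_stream__` fallback is an assumption about what else 
a caller might accept:

```python
def pick_import_path(obj):
    """Hypothetical helper: report which export hook an object offers."""
    # datafusion-python table providers expose this method; a real one
    # returns a PyCapsule wrapping an FFI table provider.
    if hasattr(obj, "__datafusion_table_provider__"):
        return "datafusion_table_provider"
    # Assumed fallback: the Arrow PyCapsule stream interface.
    if hasattr(obj, "__arrow_c_stream__"):
        return "arrow_c_stream"
    raise TypeError(f"don't know how to import {type(obj).__name__}")


class StubProvider:
    """Stand-in for a real provider (illustration only, no real capsule)."""

    def __datafusion_table_provider__(self):
        raise NotImplementedError("illustration only")
```

   The point is only that the hand-off is duck-typed: the consumer never needs 
to import datafusion-python itself, it just looks for the export method.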
   
   ...and we use the FFI definitions so that functions can, at least in theory, 
be defined in separate Python packages (I'd hoped to use this for geography out 
of the gate, but we just built everything together for the first release):
   
   
https://github.com/apache/sedona-db/blob/a15844b4ff9c7e9416b8f1c1c07ad81d908a89cd/rust/sedona/src/ffi.rs#L40-L58
   
   In terms of the Python interface, we can't just use it because we have our 
own `SessionContext` in Rust land with our own optimizer rules and everything 
all assembled together. The FFI from DataFusion isn't stable yet, and it can't 
serialize all the types of expressions we need it to (notably: the 
non-DataFusion UDFs, if I remember correctly). We also want to avoid having two 
copies of DataFusion installed (both we and datafusion-python have quite a 
large installed size).
   
   I do really like their `Expr`, though:
   
   
https://github.com/apache/datafusion-python/blob/709c918ef810d7207f12c09b82c2e1b1c4ad8290/python/datafusion/expr.py#L342-L351
   
   ...and if we can find a way to (optionally) leverage that I'd love to!


