paleolimbot commented on issue #159: URL: https://github.com/apache/sedona-db/issues/159#issuecomment-3365792649
There are optimizer rules (will be both logical and physical, with at least one logical plan extension), custom exec plans, file formats (we replace the Parquet file format with one that supports GeoParquet), and soon there will be async UDFs. One of the challenges with the FFI approach is keeping up with DataFusion changes to those structures and/or version pinning. While I do think there's some long-term future where we can get there, in the short term replicating some Python infrastructure takes much less work and requires far fewer people to agree.

If datafusion-python would be willing to split the logical expression and non-logical expression portions of the package, we could probably find a way to leverage the non-execution portion of the package. That would depend on being able to serialize arbitrary expressions (I haven't played with that yet, but I gather extension UDFs aren't quite there yet?).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
