GitHub user Imbruced added a comment to the discussion: Observations from R and Python benchmarks: performance bottlenecks and optimization ideas for sedona-db
1. Did you call `to_pandas` only, or did you perform some operations on top of the result? The `to_pandas` method relies heavily on the GeoPandas `from_arrow` method. I assume that constructing shapely objects from WKB takes most of the time in this method.

```python
GeoDataFrame.from_arrow
```

@Kontinuation, do you think there is a way to combine the C serde you wrote for Sedona a while ago with the shapely conversions? Do you think this even makes sense to do?

I am not very familiar with Polars. Do you mean converting the result to Polars or to GeoPolars? I see that you used code similar to this:

```python
import polars

table = df.to_arrow_table()
polars_df = polars.from_arrow(table)
```

One thing that affected your benchmark is that, for SedonaDB to pandas, you created shapely objects from the WKB in Arrow, whereas for Polars you just took the raw binary and did nothing with it. Could you load the data into GeoPolars, run some operations on it, do the same with GeoPandas, and compare the times?

@Robinlovelace, by any chance, do you have some benchmarks on the SedonaDB UDFs?

GitHub link: https://github.com/apache/sedona/discussions/2576#discussioncomment-15400494
