zhangfengcdt opened a new pull request, #33:
URL: https://github.com/apache/sedona-db/pull/33
Implements native Rust versions of ST_Centroid and ST_Length functions using
the geo-generic-alg library, providing substantial performance improvements
over the existing GEOS-based implementations.
- **Added ST_Centroid implementation** (`rust/sedona-geo/src/st_centroid.rs`)
  - Native Rust implementation using geo-generic-alg
  - Support for Point, LineString, Polygon, and GeometryCollection types
  - Registered as alternative to GEOS implementation
- **Added ST_Length implementation** (`rust/sedona-geo/src/st_length.rs`)
  - Native Rust implementation using geo-generic-alg
  - Support for LineString, Polygon, and GeometryCollection types
  - Comprehensive length calculation including polygon perimeters
- **Updated benchmark tests** (`benchmarks/test_functions.py`)
  - Modified ST_Length tests to use segments_large, collections_simple, and collections_complex tables
  - Enhanced test coverage for performance validation
## ST_Centroid
Before the fix:
```
---------------------------------------- benchmark 'table=polygons_complex': 2 tests ----------------------------------------
Name (time in ms)                            Median           Mean             StdDev           Min              Max
-----------------------------------------------------------------------------------------------------------------------------
test_st_centroid[polygons_complex-SedonaDB]  7.3047 (1.0)     7.3327 (1.0)     0.1581 (1.0)     7.0635 (1.0)     8.0731 (1.0)
test_st_centroid[polygons_complex-DuckDB]    22.6901 (3.11)   22.9047 (3.12)   0.5560 (3.52)    22.4189 (3.17)   24.6498 (3.05)
-----------------------------------------------------------------------------------------------------------------------------

---------------------------------------- benchmark 'table=polygons_simple': 2 tests -----------------------------------------
Name (time in ms)                            Median           Mean             StdDev           Min              Max
-----------------------------------------------------------------------------------------------------------------------------
test_st_centroid[polygons_simple-DuckDB]     1.4665 (1.0)     1.4766 (1.0)     0.0496 (1.0)     1.4512 (1.0)     2.1513 (1.0)
test_st_centroid[polygons_simple-SedonaDB]   2.3950 (1.63)    2.3920 (1.62)    0.0915 (1.84)    1.7250 (1.19)    2.7119 (1.26)
-----------------------------------------------------------------------------------------------------------------------------
```
After the fix:
```
---------------------------------------- benchmark 'table=polygons_complex': 2 tests ----------------------------------------
Name (time in ms)                            Median           Mean             StdDev           Min              Max
-----------------------------------------------------------------------------------------------------------------------------
test_st_centroid[polygons_complex-SedonaDB]  3.6323 (1.0)     3.9803 (1.0)     0.6333 (1.0)     3.4603 (1.0)     6.2812 (1.0)
test_st_centroid[polygons_complex-DuckDB]    24.0920 (6.63)   24.3458 (6.12)   0.6760 (1.07)    23.2327 (6.71)   25.6287 (4.08)
-----------------------------------------------------------------------------------------------------------------------------

---------------------------------------- benchmark 'table=polygons_simple': 2 tests -----------------------------------------
Name (time in ms)                            Median           Mean             StdDev           Min              Max
-----------------------------------------------------------------------------------------------------------------------------
test_st_centroid[polygons_simple-SedonaDB]   0.3798 (1.0)     0.3931 (1.0)     0.0368 (1.32)    0.3566 (1.0)     0.5098 (1.0)
test_st_centroid[polygons_simple-DuckDB]     1.4620 (3.85)    1.4719 (3.74)    0.0278 (1.0)     1.4469 (4.06)    1.7360 (3.41)
-----------------------------------------------------------------------------------------------------------------------------
```
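For readers unfamiliar with what a polygon centroid involves, the following is a minimal, standard-library-only sketch of an area-weighted centroid for a single ring. It is not the code in `st_centroid.rs` and does not use geo-generic-alg; the function name `ring_centroid` and the tuple-based coordinate representation are assumptions made for illustration.

```rust
/// Area-weighted centroid of a single polygon ring (illustrative only).
/// `ring` holds (x, y) vertices; it may be open or explicitly closed,
/// because the last vertex is always connected back to the first.
/// Returns None for fewer than three vertices or a zero-area ring.
fn ring_centroid(ring: &[(f64, f64)]) -> Option<(f64, f64)> {
    let n = ring.len();
    if n < 3 {
        return None;
    }
    let mut twice_area = 0.0; // sum of cross products = 2 * signed area
    let mut cx = 0.0;
    let mut cy = 0.0;
    for i in 0..n {
        let (x0, y0) = ring[i];
        let (x1, y1) = ring[(i + 1) % n];
        let cross = x0 * y1 - x1 * y0;
        twice_area += cross;
        cx += (x0 + x1) * cross;
        cy += (y0 + y1) * cross;
    }
    if twice_area == 0.0 {
        return None;
    }
    // Centroid = sum / (6 * area) = sum / (3 * twice_area).
    Some((cx / (3.0 * twice_area), cy / (3.0 * twice_area)))
}
```

Calling `ring_centroid` on the square with corners (0, 0), (2, 0), (2, 2), (0, 2) yields the point (1, 1). A full implementation would also need to handle holes, multi-part geometries, and the degenerate cases (points and lines) listed in the summary.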
## ST_Length
Before the fix:
```
---------------------------------------- benchmark 'table=collections_complex': 2 tests ----------------------------------------
Name (time in ms)                             Median           Mean             StdDev           Min              Max
--------------------------------------------------------------------------------------------------------------------------------
test_st_length[collections_complex-DuckDB]    5.8670 (1.0)     5.9233 (1.0)     0.2211 (1.0)     5.5197 (1.0)     6.6817 (1.0)
test_st_length[collections_complex-SedonaDB]  14.0494 (2.39)   14.4129 (2.43)   0.8906 (4.03)    13.7268 (2.49)   18.2037 (2.72)
--------------------------------------------------------------------------------------------------------------------------------

---------------------------------------- benchmark 'table=collections_simple': 2 tests -----------------------------------------
Name (time in ms)                             Median           Mean             StdDev           Min              Max
--------------------------------------------------------------------------------------------------------------------------------
test_st_length[collections_simple-DuckDB]     0.7602 (1.0)     0.7618 (1.0)     0.0269 (1.0)     0.7120 (1.0)     1.1222 (1.0)
test_st_length[collections_simple-SedonaDB]   9.4402 (12.42)   9.7369 (12.78)   0.7997 (29.74)   9.1862 (12.90)   14.2978 (12.74)
--------------------------------------------------------------------------------------------------------------------------------

---------------------------------------- benchmark 'table=segments_large': 2 tests ---------------------------------------------
Name (time in ms)                             Median           Mean             StdDev           Min              Max
--------------------------------------------------------------------------------------------------------------------------------
test_st_length[segments_large-DuckDB]         0.2080 (1.0)     0.2336 (1.0)     0.0436 (1.0)     0.1952 (1.0)     0.5038 (1.0)
test_st_length[segments_large-SedonaDB]       2.7922 (13.42)   2.8142 (12.05)   0.1098 (2.52)    2.6992 (13.83)   3.4997 (6.95)
--------------------------------------------------------------------------------------------------------------------------------
```
After the fix:
```
---------------------------------------- benchmark 'table=collections_complex': 2 tests ----------------------------------------
Name (time in ms)                             Median           Mean             StdDev           Min              Max
--------------------------------------------------------------------------------------------------------------------------------
test_st_length[collections_complex-DuckDB]    5.8077 (1.0)     5.8841 (1.0)     0.3003 (1.0)     5.4656 (1.0)     6.8681 (1.0)
test_st_length[collections_complex-SedonaDB]  7.4916 (1.29)    7.6525 (1.30)    0.6201 (2.06)    7.2600 (1.33)    12.8403 (1.87)
--------------------------------------------------------------------------------------------------------------------------------

---------------------------------------- benchmark 'table=collections_simple': 2 tests -----------------------------------------
Name (time in ms)                             Median           Mean             StdDev           Min              Max
--------------------------------------------------------------------------------------------------------------------------------
test_st_length[collections_simple-SedonaDB]   0.6839 (1.0)     0.6970 (1.0)     0.0463 (2.51)    0.6299 (1.0)     1.0530 (1.15)
test_st_length[collections_simple-DuckDB]     0.7205 (1.05)    0.7246 (1.04)    0.0185 (1.0)     0.7102 (1.13)    0.9124 (1.0)
--------------------------------------------------------------------------------------------------------------------------------

---------------------------------------- benchmark 'table=segments_large': 2 tests ---------------------------------------------
Name (time in ms)                             Median           Mean             StdDev           Min              Max
--------------------------------------------------------------------------------------------------------------------------------
test_st_length[segments_large-DuckDB]         0.2613 (1.06)    0.2591 (1.04)    0.0232 (1.0)     0.1965 (1.0)     0.4800 (1.0)
test_st_length[segments_large-SedonaDB]       0.2463 (1.0)     0.2497 (1.0)     0.0292 (1.26)    0.2076 (1.06)    0.4960 (1.03)
--------------------------------------------------------------------------------------------------------------------------------
```
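The summary describes ST_Length as covering LineString, Polygon, and GeometryCollection inputs, with polygon perimeters counted. A rough sketch of those semantics, using only the standard library, might look like the following. The `Geom` enum and both function names are hypothetical stand-ins for illustration; they are not the types or API of geo-generic-alg or of `st_length.rs`.

```rust
/// Hypothetical, simplified geometry model (not the sedona-db types).
/// Polygon rings are assumed to be explicitly closed (first == last).
enum Geom {
    LineString(Vec<(f64, f64)>),
    Polygon {
        exterior: Vec<(f64, f64)>,
        interiors: Vec<Vec<(f64, f64)>>,
    },
    Collection(Vec<Geom>),
}

/// Sum of Euclidean segment lengths along a coordinate sequence.
fn path_length(coords: &[(f64, f64)]) -> f64 {
    coords
        .windows(2)
        .map(|w| (w[1].0 - w[0].0).hypot(w[1].1 - w[0].1))
        .sum()
}

/// Length under the semantics described in the summary:
/// line length, polygon perimeter (all rings), and the sum over collections.
fn length(geom: &Geom) -> f64 {
    match geom {
        Geom::LineString(coords) => path_length(coords),
        Geom::Polygon { exterior, interiors } => {
            path_length(exterior) + interiors.iter().map(|r| path_length(r)).sum::<f64>()
        }
        Geom::Collection(parts) => parts.iter().map(length).sum(),
    }
}
```

Under this sketch, a collection holding the line (0, 0) to (3, 4) and a closed unit square has length 5 + 4 = 9.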
## Performance Results
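The native Rust implementations reduce SedonaDB's median time in every
benchmark above, with the largest gains on simpler geometries. ST_Centroid on
polygons_simple drops from 2.3950 ms to 0.3798 ms (about 6.3x). ST_Length drops
from 9.4402 ms to 0.6839 ms on collections_simple (about 13.8x) and from
2.7922 ms to 0.2463 ms on segments_large (about 11.3x). The complex tables
improve by roughly 2x: ST_Centroid on polygons_complex goes from 7.3047 ms to
3.6323 ms, and ST_Length on collections_complex from 14.0494 ms to 7.4916 ms.
Full compatibility with the existing API is maintained.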
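The speedup factors can be derived from the SedonaDB median rows of the benchmark output. The snippet below is only a convenience sketch of that arithmetic; the constant and function names are made up for illustration, and the numbers are copied from the tables in this message.

```rust
/// SedonaDB median times in milliseconds, copied from the benchmark
/// output in this message: (benchmark, before, after).
const MEDIANS_MS: [(&str, f64, f64); 5] = [
    ("st_centroid[polygons_complex]", 7.3047, 3.6323),
    ("st_centroid[polygons_simple]", 2.3950, 0.3798),
    ("st_length[collections_complex]", 14.0494, 7.4916),
    ("st_length[collections_simple]", 9.4402, 0.6839),
    ("st_length[segments_large]", 2.7922, 0.2463),
];

/// How many times faster the "after" run is than the "before" run.
fn speedup(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}

/// One summary line per benchmark, rounded to one decimal place.
fn summary() -> Vec<String> {
    MEDIANS_MS
        .iter()
        .map(|&(name, before, after)| format!("{}: {:.1}x", name, speedup(before, after)))
        .collect()
}
```

Medians are used here rather than means because they are less sensitive to the occasional slow iteration visible in the Max column.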