[PR] Improve perf of CRS [sedona-db]

via GitHub Fri, 12 Dec 2025 12:15:42 -0800


jesspav opened a new pull request, #446:
URL: https://github.com/apache/sedona-db/pull/446


   Flamegraphs for PR #430 revealed that CRS serialization had some 
opportunities for improvement that will be essential for raster/geo per-item 
CRS functions.  This is especially important for CRS comparisons 
in joins.
   
   Added simple benchmarks for lnglat and auth code equality that were run 
repeatedly with different performance optimizations.  At the start they took 
about 62ms for auth codes and 37.7ms for lnglat.  By the end, auth codes had 
improved by about 18x and lnglat by about 11x, with both running in about 3.4ms.
   
   There are basically two changes: an LRU cache and avoiding unnecessary 
string allocations.
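
To make the caching idea concrete, here is a minimal sketch of an LRU cache in front of CRS parsing. It is not the code from this PR: `ParsedCrs`, `parse_crs`, and `CrsCache` are hypothetical names, the parse step is a stand-in for the expensive work, and eviction uses a linear scan for brevity where a real implementation would likely use a dedicated LRU structure.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a parsed CRS; the real sedona-db type may differ.
#[derive(Debug, PartialEq)]
pub struct ParsedCrs {
    pub authority: String,
    pub code: String,
}

/// Hypothetical stand-in for the expensive step that the cache avoids repeating.
pub fn parse_crs(s: &str) -> Option<ParsedCrs> {
    let (authority, code) = s.split_once(':')?;
    Some(ParsedCrs {
        authority: authority.to_string(),
        code: code.to_string(),
    })
}

/// Tiny LRU cache: every entry remembers the tick at which it was last used.
pub struct CrsCache {
    capacity: usize,
    tick: u64,
    entries: HashMap<String, (Arc<ParsedCrs>, u64)>,
    /// Number of lookups that were not served from the cache.
    pub misses: usize,
}

impl CrsCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { capacity, tick: 0, entries: HashMap::new(), misses: 0 }
    }

    /// Returns the cached value for `key`, parsing and inserting it on a miss.
    pub fn get_or_parse(&mut self, key: &str) -> Option<Arc<ParsedCrs>> {
        self.tick += 1;
        let tick = self.tick;

        // Hit: refresh the last-used tick and hand out a cheap Arc clone.
        if let Some((crs, last_used)) = self.entries.get_mut(key) {
            *last_used = tick;
            return Some(Arc::clone(crs));
        }

        // Miss: parse once; unparsable input is not cached.
        self.misses += 1;
        let parsed = Arc::new(parse_crs(key)?);

        // At capacity: evict the entry with the oldest last-used tick.
        if self.entries.len() >= self.capacity {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (_, last_used))| *last_used)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.entries.remove(&k);
            }
        }

        self.entries.insert(key.to_string(), (Arc::clone(&parsed), tick));
        Some(parsed)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Repeated lookups of the same CRS string then return clones of one shared `Arc` instead of re-parsing, which is the kind of saving a join comparing the same few CRS values many times would benefit from.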
   
   Breakdown of performance changes:
   ```
   Starting point:
   
   
        Running benches/crs_benchmarks.rs (target/release/deps/crs_benchmarks-c62a15af9667b709)
   equality_lnglat_crs     time:   [37.442 ms 37.712 ms 37.960 ms]
                           change: [+2.5665% +4.0155% +5.4781%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 15 outliers among 100 measurements (15.00%)
     6 (6.00%) low severe
     9 (9.00%) low mild
   
   Benchmarking equality_different_crs: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
   equality_different_crs  time:   [61.892 ms 62.119 ms 62.362 ms]
                           change: [+2.1679% +2.6956% +3.2470%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 7 outliers among 100 measurements (7.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
     5 (5.00%) high severe
   
   
   After cache:
   equality_lnglat_crs     time:   [28.265 ms 28.361 ms 28.461 ms]
                           change: [−25.338% −24.795% −24.189%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   equality_different_crs  time:   [27.473 ms 27.530 ms 27.586 ms]
                           change: [−55.878% −55.683% −55.494%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   
   
   
   Time after cache + switch value to str + colon rather than split:
   
   equality_lnglat_crs     time:   [11.324 ms 11.354 ms 11.385 ms]
                           change: [−60.148% −59.967% −59.798%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
   
   equality_different_crs  time:   [11.655 ms 11.701 ms 11.746 ms]
                           change: [−57.674% −57.497% −57.308%] (p = 0.00 < 0.05)
                           Performance has improved.
   
   
   
   Time after cache + switch value to str + colon rather than split + pre-alloc instead of format for eq:
   Benchmarking equality_lnglat_crs: Collecting 100 samples in estimated 5.4946 s (800 iterat
   equality_lnglat_crs     time:   [6.6983 ms 6.7258 ms 6.7557 ms]
                           change: [+0.7120% +1.2645% +1.8552%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) high mild
     4 (4.00%) high severe
   
   Benchmarking equality_different_crs: Collecting 100 samples in estimated 5.4518 s (800 ite
   equality_different_crs  time:   [6.7116 ms 6.7429 ms 6.7758 ms]
                           change: [+0.7410% +1.5266% +2.2864%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 4 outliers among 100 measurements (4.00%)
     4 (4.00%) high mild
   
   
   All of the above + not splitting up the auth code:
   equality_lnglat_crs     time:   [3.3993 ms 3.4098 ms 3.4200 ms]
                           change: [−49.633% −49.425% −49.229%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) low mild
   
   equality_different_crs  time:   [3.4080 ms 3.4170 ms 3.4258 ms]
                           change: [−49.183% −48.996% −48.819%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) low mild
   ```
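
The string-handling steps named in the breakdown (looking for the colon instead of collecting a split, and not building a string with `format!` just to compare) can be illustrated with a small sketch. This is an assumption about the general technique, not the PR's actual code: both function names are hypothetical, and the comparison is assumed to be case-sensitive.

```rust
/// Allocating version: collects the split into a Vec and builds a new String
/// with `format!` on every comparison.
pub fn auth_code_eq_alloc(authority: &str, code: &str, other: &str) -> bool {
    let parts: Vec<&str> = other.split(':').collect();
    parts.len() == 2 && format!("{}:{}", authority, code) == other
}

/// Allocation-free version: checks the authority prefix, the colon, and the
/// remaining code in place, borrowing from `other` the whole time.
pub fn auth_code_eq(authority: &str, code: &str, other: &str) -> bool {
    other
        .strip_prefix(authority)
        .and_then(|rest| rest.strip_prefix(':'))
        .map_or(false, |rest| rest == code)
}
```

For inputs such as `("EPSG", "4326", "EPSG:4326")` both functions agree, but the second touches no heap memory, which matters when the comparison runs once per row.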

