Kontinuation opened a new pull request, #523:
URL: https://github.com/apache/sedona-db/pull/523

   This PR addresses performance bottlenecks (stragglers) observed during the 
candidate refinement phase of SpatialBench Q10 and Q11, particularly at higher 
scale factors (SF=100 and SF=1000).
   
   When executing queries with large windows on dense datasets, a single R-Tree 
index query can retrieve millions of candidates. The probe partition becomes a 
"straggler" because it must sequentially evaluate spatial predicates for these 
millions of geometries. Since this bottleneck occurs within a single partition, 
DataFusion’s partition-level parallelism is unable to distribute the load.
   
   This patch introduces an async batch query interface for SpatialIndex. It 
allows the engine to split a massive refinement workload into smaller tasks, 
which an async runtime then executes in parallel. Batching amortizes the 
scheduling cost of async function calls, and parallel execution removes the 
single-partition bottleneck.
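
   The splitting idea can be sketched as below. This is only a minimal 
illustration, not the code in this PR: `Rect`, `intersects`, and 
`refine_in_chunks` are hypothetical names, an axis-aligned box test stands in 
for the real spatial predicate, and scoped threads from the standard library 
stand in for tasks on an async runtime so the example has no dependencies.

```rust
use std::thread;

/// Hypothetical candidate: an axis-aligned box standing in for a real geometry.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    min_x: f64,
    min_y: f64,
    max_x: f64,
    max_y: f64,
}

/// Stand-in for the exact spatial predicate evaluated during refinement.
fn intersects(a: &Rect, b: &Rect) -> bool {
    a.min_x <= b.max_x && b.min_x <= a.max_x && a.min_y <= b.max_y && b.min_y <= a.max_y
}

/// Refine the candidates of one index query in fixed-size chunks, one task
/// per chunk, and return the indices of matching candidates in input order.
/// Assumption: scoped threads replace the async runtime tasks described above.
fn refine_in_chunks(probe: &Rect, candidates: &[Rect], chunk_size: usize) -> Vec<usize> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    thread::scope(|s| {
        // Spawn one task per chunk; each task evaluates many predicates,
        // so the per-task scheduling cost is spread over the whole chunk.
        let handles: Vec<_> = candidates
            .chunks(chunk_size)
            .enumerate()
            .map(|(chunk_idx, chunk)| {
                s.spawn(move || {
                    let base = chunk_idx * chunk_size;
                    chunk
                        .iter()
                        .enumerate()
                        .filter(|(_, c)| intersects(probe, c))
                        .map(|(i, _)| base + i)
                        .collect::<Vec<usize>>()
                })
            })
            .collect();

        // Join in spawn order so results keep the original candidate order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("refinement task panicked"))
            .collect()
    })
}
```

   Calling `refine_in_chunks(&probe, &candidates, chunk_size)` with a chunk 
size in the thousands keeps the number of tasks small relative to the number 
of candidates; a chunk size that is too small would spend more time on task 
scheduling than on predicate evaluation.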

