[PR] chore(rust/sedona-spatial-join): Split large join result batches into smaller ones [sedona-db]

via GitHub Sun, 18 Jan 2026 23:41:38 -0800


Kontinuation opened a new pull request, #525:
URL: https://github.com/apache/sedona-db/pull/525


   This is a follow-up to https://github.com/apache/sedona-db/pull/523. When 
executing queries with large windows on dense datasets, each probe row may 
match millions of indexed rows. If we don't break up the large result batches 
generated by such index probing, we can easily exceed the memory limit when 
assembling join result batches.
   
   This patch splits the large joined build-side and probe-side indices into 
smaller pieces and assembles result batches incrementally. This greatly reduces 
the memory required to produce join results for "cover all" probe rows. The 
code that correctly slices join result indices for the various join types is 
somewhat complicated, so we have added fuzz tests to verify that it works 
correctly.
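
   As a rough illustration of the splitting idea (not the code in this PR), 
the sketch below cuts a pair of matched index arrays into pieces of at most 
`max_rows` entries, so that each piece can be turned into one bounded result 
batch. The names `chunk_ranges` and `split_join_indices`, the `u64`/`u32` 
index types, and the `max_rows` parameter are all assumptions made for this 
example. It only covers the simple inner-join case; the per-join-type 
bookkeeping mentioned above is not shown.

```rust
use std::ops::Range;

/// Hypothetical helper: yield consecutive ranges of at most `max_rows`
/// elements that together cover `0..total`.
fn chunk_ranges(total: usize, max_rows: usize) -> impl Iterator<Item = Range<usize>> {
    assert!(max_rows > 0, "max_rows must be positive");
    (0..total)
        .step_by(max_rows)
        .map(move |start| start..(start + max_rows).min(total))
}

/// Hypothetical helper: slice the paired build-side and probe-side index
/// arrays into bounded pieces. Entry `i` of both arrays describes one
/// matched (build row, probe row) pair, so both are cut at the same offsets.
fn split_join_indices<'a>(
    build_indices: &'a [u64],
    probe_indices: &'a [u32],
    max_rows: usize,
) -> impl Iterator<Item = (&'a [u64], &'a [u32])> + 'a {
    assert_eq!(
        build_indices.len(),
        probe_indices.len(),
        "index arrays must describe the same number of matched pairs"
    );
    chunk_ranges(build_indices.len(), max_rows)
        .map(move |r| (&build_indices[r.clone()], &probe_indices[r]))
}
```

   Because the pieces are produced lazily, a caller can materialize one 
result batch per piece and release it before moving on to the next one, 
instead of holding the output for all matched pairs at once.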


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
