Kontinuation opened a new issue, #282:
URL: https://github.com/apache/sedona-db/issues/282

   The 
[`compute_properties`](https://github.com/apache/sedona-db/blob/9cecb5b42e7ed2e5c234d61c24ea286ba3d58245/rust/sedona-spatial-join/src/exec.rs#L230-L248)
 and other output data properties such as 
[`maintains_input_order`](https://github.com/apache/sedona-db/blob/9cecb5b42e7ed2e5c234d61c24ea286ba3d58245/rust/sedona-spatial-join/src/exec.rs#L203-L223)
 of SpatialJoinExec may be incorrect.
   
   * `maintains_input_order` says that the order of the right side is 
maintained when running an inner/right/right-anti/right-semi join, which is 
incorrect for a KNN join, since the left side might be the probe side.
   * `compute_properties` may not be correct when the SpatialJoinExec was 
converted from a HashJoin. It states that the [output partitioning of the left 
side could be 
maintained](https://github.com/apache/sedona-db/blob/9cecb5b42e7ed2e5c234d61c24ea286ba3d58245/rust/sedona-spatial-join/src/exec.rs#L274-L278),
 but this is not the case.
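
   To illustrate the first point, here is a minimal sketch of a more 
conservative ordering rule. It is not the SpatialJoinExec code: `JoinType` and 
`ProbeSide` are hypothetical stand-ins for the DataFusion and sedona-db types, 
and the function signature is assumed for illustration. The only change from 
the behaviour described above is that the right side's order is reported as 
maintained only when the right side is actually the probe side.

```rust
// Hypothetical stand-ins for the real join types; not the sedona-db definitions.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum JoinType {
    Inner,
    Left,
    Right,
    Full,
    LeftSemi,
    LeftAnti,
    RightSemi,
    RightAnti,
}

// Hypothetical marker for which input is streamed (probed) by the join.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ProbeSide {
    Left,
    Right,
}

/// Returns `[left_order_maintained, right_order_maintained]`.
///
/// Conservative assumption: the right side's order is reported as maintained
/// only when the right side is the probe side AND the join type is one of
/// inner/right/right-anti/right-semi. Nothing is claimed for the left side.
fn maintains_input_order(join_type: JoinType, probe_side: ProbeSide) -> [bool; 2] {
    let right_kept = probe_side == ProbeSide::Right
        && matches!(
            join_type,
            JoinType::Inner | JoinType::Right | JoinType::RightAnti | JoinType::RightSemi
        );
    [false, right_kept]
}
```

   With this rule, a KNN join that probes with the left side never reports an 
ordering on the right side, whatever the join type.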
   
   We can change `compute_properties` and `maintains_input_order` to be more 
conservative, but that will break the data distribution and ordering 
requirements of the downstream physical operators and fail the [query plan 
sanity 
check](https://github.com/apache/sedona-db/blob/9cecb5b42e7ed2e5c234d61c24ea286ba3d58245/rust/sedona-spatial-join/src/optimizer.rs#L562),
 unless we run the enforce distribution and enforce sorting rules again after 
replacing NestedLoopJoinExec or HashJoinExec with SpatialJoinExec.
   
   A better way is to inject our custom physical planning rules and emit 
SpatialJoinExec directly for the logical join operator in the planning phase. We 
chose to mess around with physical plan optimizations rather than extending the 
planner because DataFusion does not have a good API for plugging in a custom 
planner. We'll take another look at this issue and see how we can do this 
cleanly.

