alamb commented on code in PR #13560:
URL: https://github.com/apache/datafusion/pull/13560#discussion_r1870587465


##########
datafusion/core/src/physical_optimizer/sort_pushdown.rs:
##########
@@ -606,6 +611,102 @@ fn handle_custom_pushdown(
     }
 }
 
+// For hash join we only maintain the input order for the right child
+// for join type: Inner, Right, RightSemi, RightAnti
+fn handle_hash_join(

Review Comment:
   I think it would be great eventually to get this kind of operator specific 
logic into the operators (e.g. some method in HashJoinExec). Definitely not in 
this PR, but having assumptions about the operator separate from its 
implementation gives us a larger chance of introducing inconsistencies I think



##########
datafusion/sqllogictest/test_files/joins.slt:
##########
@@ -3160,10 +3160,10 @@ explain SELECT t1_id, t1_name, t1_int FROM 
right_semi_anti_join_table_t2 t2 RIGH
 ----
 physical_plan
 01)SortPreservingMergeExec: [t1_id@0 ASC NULLS LAST]
-02)--SortExec: expr=[t1_id@0 ASC NULLS LAST], preserve_partitioning=[true]
-03)----CoalesceBatchesExec: target_batch_size=2
-04)------HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(t2_id@0, 
t1_id@0)], filter=t2_name@0 != t1_name@1
-05)--------MemoryExec: partitions=1, partition_sizes=[1]
+02)--CoalesceBatchesExec: target_batch_size=2

Review Comment:
   These plans imply that the sort has been pushed into the second (probe) 
input which makes sense I think : 
https://docs.rs/datafusion/latest/datafusion/physical_plan/joins/struct.HashJoinExec.html
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to