alamb commented on code in PR #16436:
URL: https://github.com/apache/datafusion/pull/16436#discussion_r2155551968
##########
datafusion/physical-plan/src/joins/symmetric_hash_join.rs:
##########
@@ -810,6 +810,21 @@ where
{
// Store the result in a tuple
let result = match (build_side, join_type) {
+ // For a mark join we “mark” each build‐side row with a dummy 0 in the probe‐side index
+ // if it ever matched. For example, if
+ //
+ // prune_length = 5
+ // deleted_offset = 0
+ // visited_rows = {1, 3}
+ //
+ // then we produce:
+ //
+ // build_indices = [0, 1, 2, 3, 4]
+ // probe_indices = [None, Some(0), None, Some(0), None]
+ //
+ // Example: for each build row i in [0..5):
+ // – We always output its own index i in `build_indices`
+ // – We output `Some(0)` in `probe_indices[i]` if row i was ever visited, else `None`
(JoinSide::Left, JoinType::LeftMark) => {
Review Comment:
Perhaps @metesynnada, @ozankabak, or @berkaysynnada could double-check that this
description is accurate?
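
To make the check easier, here is a minimal standalone sketch of what the comment describes. It is not the DataFusion implementation: `mark_indices` is a hypothetical helper, plain `Vec`s stand in for the Arrow index arrays, and the lookup `visited_rows.contains(&(i + deleted_offset))` is an assumption about how the offset is applied (the comment's example only covers `deleted_offset = 0`).

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of DataFusion) mirroring the comment:
/// every build row in the pruned range is emitted, and its probe-side
/// slot is `Some(0)` if the row was ever visited, otherwise `None`.
fn mark_indices(
    prune_length: usize,
    deleted_offset: usize,
    visited_rows: &HashSet<usize>,
) -> (Vec<u64>, Vec<Option<u64>>) {
    // Each build row always outputs its own index.
    let build_indices: Vec<u64> = (0..prune_length as u64).collect();

    // Assumption: visited rows are stored with `deleted_offset` added.
    let probe_indices: Vec<Option<u64>> = (0..prune_length)
        .map(|i| visited_rows.contains(&(i + deleted_offset)).then_some(0u64))
        .collect();

    (build_indices, probe_indices)
}

fn main() {
    // The example from the code comment.
    let visited: HashSet<usize> = [1, 3].into_iter().collect();
    let (build, probe) = mark_indices(5, 0, &visited);

    assert_eq!(build, vec![0, 1, 2, 3, 4]);
    assert_eq!(probe, vec![None, Some(0), None, Some(0), None]);
    println!("build_indices = {:?}", build);
    println!("probe_indices = {:?}", probe);
}
```

If the real code applies `deleted_offset` differently, the comment's example would still hold (since the offset is 0 there), so a second example with a non-zero offset might be worth adding to the comment.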
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]