jonathanc-n commented on code in PR #20478:
URL: https://github.com/apache/datafusion/pull/20478#discussion_r2839310905
##########
datafusion/physical-plan/src/joins/sort_merge_join/stream.rs:
##########
@@ -153,17 +156,15 @@ impl StreamedBatch {
idx: 0,
join_arrays: vec![],
output_indices: vec![],
+ num_output_rows: 0,
buffered_batch_idx: None,
join_filter_matched_idxs: HashSet::new(),
}
}
/// Number of unfrozen output pairs in this streamed batch
fn num_output_rows(&self) -> usize {
Review Comment:
nit: i think this function can be removed, we can call
`num_unfrozen_pairs()` as `streamed_batch.num_output_rows`. and add a small
comment for the `num_output_rows` field to declare that it represents unfrozen
pairs
##########
datafusion/physical-plan/src/joins/sort_merge_join/stream.rs:
##########
@@ -1437,6 +1433,7 @@ impl SortMergeJoinStream {
}
self.streamed_batch.output_indices.clear();
+ self.streamed_batch.num_output_rows = 0;
Review Comment:
I think we should probably encapsulate this reset pattern into its own
function? (self.reset() calls .clear() and sets num_output_rows = 0)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]