kosiew commented on code in PR #21962:
URL: https://github.com/apache/datafusion/pull/21962#discussion_r3193919876
##########
datafusion/physical-plan/src/joins/sort_merge_join/materializing_stream.rs:
##########
@@ -978,6 +996,20 @@ impl MaterializingSortMergeJoinStream {
.unwrap(); // Operation only return None if no
batches are spilled, here we ensure that at least one batch is spilled
buffered_batch.batch =
BufferedBatchState::Spilled(spill_file);
+
+ // Track remaining in-memory data (join key arrays)
that
+ // stay in memory even after the batch is spilled.
This is
+ // much smaller than the full batch, so try_grow should
+ // usually succeed. If it fails, reserved_amount stays
0 -
+ // best-effort tracking, free_reservation will safely
be a no-op.
+ let join_arrays_mem = buffered_batch.join_arrays_mem();
+ if self.reservation.try_grow(join_arrays_mem).is_ok() {
Review Comment:
I think the spill path can still leave retained join key arrays invisible to
the memory pool.
Right now, if the full batch `try_grow(size_estimation)` fails because the
pool is full, and the follow-up `try_grow(join_arrays_mem)` also fails, we
spill the IPC batch but still push `buffered_batch` with `reserved_amount = 0`.
At that point the operator is still holding the retained `join_arrays`, but
the pool is no longer aware of them when making spill decisions for other
operators. This seems like the same invariant violation we were trying to avoid.
I think this can still happen with concurrent reservations or when the
memory limit is below a single join-array allocation, and in those cases many
skewed spilled batches could accumulate untracked memory.
Can we make the retained in-memory portion accounted deterministically here?
For example, by growing or resizing the reservation after the physical memory
is retained, or by returning an error instead of continuing untracked.
It would also be great to add a regression test that covers the no-headroom
path where `try_grow(join_arrays_mem)` fails, since the current test only
exercises the successful reservation case.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]