mbutrovich commented on PR #21962: URL: https://github.com/apache/datafusion/pull/21962#issuecomment-4390861669
Thanks for turning this around quickly, the force-grow version is easier to reason about than the conditional one. A few things I wanted to ask about before signing off: **grow() can push over the pool limit** `reservation.grow()` is infallible, so in the no-headroom case (which `spill_join_arrays_no_headroom` exercises) we'll end up with `pool.reserved() > memory_limit`. I think that's the right call since the memory is physically there, but it does mean the next operator to call `try_grow` sees a pool that's already over limit and fails. Could we add a one-line comment at the `grow()` site noting this is intentional? Future readers running into a "why is the pool over limit" question will appreciate the pointer. **Recomputing join_arrays memory** `BufferedBatch::new` already sums `get_array_memory_size()` over `join_arrays` as part of `size_estimation` (line 262-265). `join_arrays_mem()` walks the same arrays again on the spill path. Cheap, but would you be open to caching the sum as a field during construction? One less place for the two size calculations to drift if someone touches the estimation formula later. **try_shrink vs shrink on free** Given the invariant that we only shrink by what we grew, would `shrink()` read more symmetrically with the force-grow on the alloc side? The `?` on `try_shrink(reserved_amount)` is propagating a `Result` that shouldn't be reachable. Not a blocker, just noticed the asymmetry. **peak_mem assertion strength** The two new tests check `peak_mem > 0`, which catches the "never tracked at all" regression nicely. Would `peak_mem >= join_arrays_mem` be worth tightening to? That would also catch a future bug where we track something but not the right amount. Happy to leave as-is if you think the current form is enough. Perf-wise I looked at this against #20729 and I don't think the concerns there apply here. The extra `grow()` only fires on the spill branch, and spilling already pays for disk I/O, so one more pool call on a custom backend is in the noise. Non-spill workloads see the same pool-call count as before. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
