DerGut commented on code in PR #15692:
URL: https://github.com/apache/datafusion/pull/15692#discussion_r2040700866
##########
datafusion/physical-plan/src/sorts/sort.rs:
##########
@@ -765,6 +765,25 @@ impl ExternalSorter {
Ok(())
}
+
+ /// Reserves memory to be able to accommodate the given batch.
+ /// If memory is scarce, tries to spill current in-memory batches to disk
first.
+ async fn reserve_memory_for_batch(&mut self, input: &RecordBatch) ->
Result<()> {
+ let size = get_reserved_byte_for_record_batch(input);
+
+ let result = self.reservation.try_grow(size);
+ if result.is_err() {
+ if self.in_mem_batches.is_empty() {
+ return result;
Review Comment:
@2010YOUY01 since you commented on the issue
https://github.com/apache/datafusion/issues/15675#issuecomment-2796177594 with
suggestions:
> I think the fix (for a more user-friendly error message) would be
>
> 1. When sort_spill_reservation_byte reservation failed, the error message
should include considering set the memory limit larger than this reservation
memory size
> 2. For the code path triggered this internal error, return an
ExecutionError instead with a similar error message
This code now emits a
[`DataFusionError::ResourcesExhausted`](https://github.com/apache/datafusion/blob/0b01fdf7f02f9097c319204058576f420b9790b4/datafusion/common/src/error.rs#L126)
instead of the suggested
[`DataFusionError::Execution`](https://github.com/apache/datafusion/blob/0b01fdf7f02f9097c319204058576f420b9790b4/datafusion/common/src/error.rs#L117).
I think this makes sense, given the descriptions of the errors.
E.g. Execution error doesn't seem to apply:
> This error is returned when an error happens during execution due to a
malformed input.
but ResourcesExhausted does:
> This error is thrown when a consumer cannot acquire additional memory
---
I still like the idea of suggesting to bump the memory limit in the error
message.
However, since the code simply bubbles up the error emitted by the memory
pool's
[`insufficient_capacity_err`](https://github.com/apache/datafusion/blob/0b01fdf7f02f9097c319204058576f420b9790b4/datafusion/execution/src/memory_pool/pool.rs#L249)
macro, we would either need to map it, or change the macro's message.
I don't think mapping makes sense here because the error is generally used
for reservation-related issues - I don't think this is a special case.
Changing the error message would probably require a separate discussion.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]