DerGut commented on code in PR #15692:
URL: https://github.com/apache/datafusion/pull/15692#discussion_r2040700866


##########
datafusion/physical-plan/src/sorts/sort.rs:
##########
@@ -765,6 +765,25 @@ impl ExternalSorter {
 
         Ok(())
     }
+
+    /// Reserves memory to be able to accommodate the given batch.
+    /// If memory is scarce, tries to spill current in-memory batches to disk 
first.
+    async fn reserve_memory_for_batch(&mut self, input: &RecordBatch) -> 
Result<()> {
+        let size = get_reserved_byte_for_record_batch(input);
+
+        let result = self.reservation.try_grow(size);
+        if result.is_err() {
+            if self.in_mem_batches.is_empty() {
+                return result;

Review Comment:
   @2010YOUY01 since you commented on the issue 
https://github.com/apache/datafusion/issues/15675#issuecomment-2796177594 with 
suggestions:
   
   > I think the fix (for a more user-friendly error message) would be
   >
   > 1. When sort_spill_reservation_byte reservation failed, the error message 
should include considering set the memory limit larger than this reservation 
memory size
   > 2. For the code path triggered this internal error, return an 
ExecutionError instead with a similar error message
   
   This code now emits a 
[`DataFusionError::ResourcesExhausted`](https://github.com/apache/datafusion/blob/0b01fdf7f02f9097c319204058576f420b9790b4/datafusion/common/src/error.rs#L126)
 instead of the suggested 
[`DataFusionError::Execution`](https://github.com/apache/datafusion/blob/0b01fdf7f02f9097c319204058576f420b9790b4/datafusion/common/src/error.rs#L117).
 I think this makes sense, given the descriptions of the errors.
   
   E.g. Execution error doesn't seem to apply:
   > This error is returned when an error happens during execution due to a 
malformed input.
   
   but ResourcesExhausted does:
   > This error is thrown when a consumer cannot acquire additional memory
   
   ---
   
   I still like the idea of suggesting to bump the memory limit in the error 
message.
   
   However, since the code simply bubbles up the error emitted by the memory 
pool's 
[`insufficient_capacity_err`](https://github.com/apache/datafusion/blob/0b01fdf7f02f9097c319204058576f420b9790b4/datafusion/execution/src/memory_pool/pool.rs#L249)
 macro, we would either need to map it, or change the macro's message.
   
   I don't think mapping makes sense here because the error is generally used 
for reservation-related issues - I don't think this is a special case.
   Changing the error message would probably require a separate discussion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to