DerGut commented on issue #15675:
URL: https://github.com/apache/datafusion/issues/15675#issuecomment-2795408103

   ### What I've found so far:
   1. `sort_spill_reservation_bytes` bytes are [reserved for the merge](https://github.com/apache/datafusion/blob/5ab5a03724b3afa009ba480a022145875972d08c/datafusion/physical-plan/src/sorts/sort.rs#L325).
   2. The [follow-up sort reservation](https://github.com/apache/datafusion/blob/5ab5a03724b3afa009ba480a022145875972d08c/datafusion/physical-plan/src/sorts/sort.rs#L328) for the next batch fails because no memory is available (according to the configured limit).
   3. The `ExternalSorter` tries to free up memory by sorting any in-memory records and spilling them to disk, but there aren't any, so most of [`ExternalSorter::sort_and_spill_in_mem_batches`](https://github.com/apache/datafusion/blob/5ab5a03724b3afa009ba480a022145875972d08c/datafusion/physical-plan/src/sorts/sort.rs#L531) becomes a no-op.
   4. The [code errors](https://github.com/apache/datafusion/blob/5ab5a03724b3afa009ba480a022145875972d08c/datafusion/physical-plan/src/sorts/sort.rs#L453) when finishing the spill because it expects that _something_ was spilled to disk.
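
   The four steps above can be modelled with a toy memory pool. This is only a rough sketch, assuming the merge reservation consumes the whole pool; `ToyPool` and `reserve_for_batch` are hypothetical names for illustration and are not DataFusion APIs, and the real control flow in `sort.rs` is more involved.

   ```rust
   // Toy model only: `ToyPool` and `reserve_for_batch` are hypothetical,
   // they are not part of DataFusion.
   #[derive(Debug)]
   struct ToyPool {
       limit: usize,
       used: usize,
   }

   impl ToyPool {
       /// Reserve `bytes` from the pool, or report how much was still free.
       fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
           let available = self.limit - self.used;
           if bytes > available {
               return Err(format!(
                   "failed to allocate {bytes} bytes, {available} bytes available"
               ));
           }
           self.used += bytes;
           Ok(())
       }
   }

   /// Steps 2-4: reserve memory for the next batch; on failure, spill the
   /// buffered batches and retry.
   fn reserve_for_batch(
       pool: &mut ToyPool,
       in_mem_batches: &mut Vec<usize>,
       batch_bytes: usize,
   ) -> Result<(), String> {
       // Step 2: the follow-up reservation for the next batch.
       if pool.try_grow(batch_bytes).is_ok() {
           in_mem_batches.push(batch_bytes);
           return Ok(());
       }
       // Step 3: "spill" whatever is buffered to free its memory.
       let spilled: usize = in_mem_batches.drain(..).sum();
       pool.used -= spilled;
       // Step 4: nothing was buffered, so nothing was spilled or freed.
       if spilled == 0 {
           return Err("internal error: expected spilled data, found none".to_string());
       }
       pool.try_grow(batch_bytes)?;
       in_mem_batches.push(batch_bytes);
       Ok(())
   }

   fn main() {
       let limit = 10 * 1024 * 1024;
       let mut pool = ToyPool { limit, used: 0 };
       // Step 1: the merge reservation takes the entire pool (assumption).
       pool.try_grow(limit).unwrap();
       let mut batches = Vec::new();
       // The very first batch already cannot be reserved, and there is
       // nothing in memory to spill.
       let result = reserve_for_batch(&mut pool, &mut batches, 8192);
       println!("{result:?}");
   }
   ```

   In this model the first batch hits the "nothing to spill" branch, whereas a pool with headroom left after the merge reservation can spill earlier batches and retry.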
   
   ### What's happening on the happy path instead?
   I'm only looking at scenarios that spill to disk, as this seems to be where 
the ExternalSorter is tripping up.
   
   #### sort_spill_reservation_bytes > memory limit
   The `ExternalSorter` already tries to reserve more memory [for the merge](https://github.com/apache/datafusion/blob/5ab5a03724b3afa009ba480a022145875972d08c/datafusion/physical-plan/src/sorts/sort.rs#L325) than is available. It fails with:
   ```
   Error: Resources exhausted: Failed to allocate additional 10485760 bytes for ExternalSorterMerge[0] with 0 bytes already allocated for this reservation - 9437184 bytes remain available for the total pool
   ```
   (not exactly a happy path but we get an error that makes sense to the user)
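
   To make the numbers in that message concrete: the request is 10 MiB while 9 MiB remain in the pool, so the merge reservation is 1 MiB short. A small sketch of the arithmetic follows; `shortfall` is a hypothetical helper for illustration, not a DataFusion function.

   ```rust
   const MIB: usize = 1024 * 1024;

   /// Hypothetical helper: how many bytes a request exceeds the available
   /// memory by, or `None` if the request fits.
   fn shortfall(requested: usize, available: usize) -> Option<usize> {
       requested.checked_sub(available).filter(|&s| s > 0)
   }

   fn main() {
       // Values copied from the error message above.
       let requested = 10_485_760;
       let available = 9_437_184;
       assert_eq!(requested, 10 * MIB);
       assert_eq!(available, 9 * MIB);
       // The merge reservation is 1 MiB larger than what the pool has left.
       assert_eq!(shortfall(requested, available), Some(MIB));
   }
   ```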
   
   ### Other cases
   
   Playing around with this more, I've also seen the internal error pop up for some combinations where `sort_spill_reservation_bytes` < memory limit. I will need to look into this more tomorrow.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

