xuchen-plus commented on issue #12136:
URL: https://github.com/apache/datafusion/issues/12136#issuecomment-2609230218

   We have also encountered this issue. After some debugging (by adding debug 
logs before every call to `try_grow`), it seems that the memory accounting for 
batches returned by `SortPreserveMerge` is wrong.
   
   I set `sort_spill_reservation_bytes` to 8MB, and `memory_limit` to 256MB.
   
   After inserting some batches, the pool fills up and the in-memory sort 
starts. At this moment the pool usage is: `GreedyMemoryPool { pool_size: 268435456, used: 
267618380 }`.
   The sort-and-collect step at
   
https://github.com/apache/datafusion/blob/0228bee29aa97abc354b06f3278e709d752d92b3/datafusion/physical-plan/src/sorts/sort.rs#L440-L443
   finished without error. After the sort, all memory is released.
   
   
   
   Then the following code counts the memory of the sorted batches:
   
https://github.com/apache/datafusion/blob/0228bee29aa97abc354b06f3278e709d752d92b3/datafusion/physical-plan/src/sorts/sort.rs#L445-L449
   The resulting size is 698840964 bytes, which exceeds the memory limit and results in:
   ```
   Error: ResourcesExhausted("Additional allocation failed with top memory 
consumers (across reservations) as: ExternalSorterMerge[0] consumed 33554432 
bytes, ExternalSorterMerge[18] consumed 0 bytes, ExternalSorter[7] consumed 0 
bytes, ExternalSorter[14] consumed 0 bytes, ExternalSorter[1] consumed 0 bytes. 
Error: Failed to allocate additional 698840964 bytes for ExternalSorter[0] with 
0 bytes already allocated for this reservation - 234881024 bytes remain 
available for the total pool")
   ```
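
   As a sanity check on the numbers in the error above, here is a minimal, std-only sketch of the arithmetic. `ToyGreedyPool`, `replay_reported_state`, and `growth_ratio` are hypothetical names made up for illustration; this is not DataFusion's `GreedyMemoryPool`, it only assumes a pool that grants a request whenever it fits in the remaining bytes. The starting state (33554432 bytes held by the merge reservation, everything else at 0) is taken from the error message.

```rust
/// Toy greedy pool (hypothetical; NOT DataFusion's `GreedyMemoryPool`).
/// It grants a request whenever the request fits in the remaining bytes.
#[derive(Debug)]
struct ToyGreedyPool {
    pool_size: usize,
    used: usize,
}

impl ToyGreedyPool {
    fn new(pool_size: usize) -> Self {
        Self { pool_size, used: 0 }
    }

    /// Bytes that can still be reserved.
    fn available(&self) -> usize {
        self.pool_size - self.used
    }

    /// Reserve `additional` bytes, or report how many bytes remain.
    fn try_grow(&mut self, additional: usize) -> Result<(), String> {
        let available = self.available();
        if additional > available {
            return Err(format!(
                "Failed to allocate additional {} bytes - {} bytes remain available for the total pool",
                additional, available
            ));
        }
        self.used += additional;
        Ok(())
    }
}

/// Rebuild the pool state reported in the error and retry the failing request.
fn replay_reported_state() -> (ToyGreedyPool, Result<(), String>) {
    // memory_limit = 256MB
    let mut pool = ToyGreedyPool::new(256 * 1024 * 1024);
    // Reported as still held by ExternalSorterMerge[0].
    pool.try_grow(33_554_432).expect("merge reservation fits");
    // Size computed for the sorted batches.
    let result = pool.try_grow(698_840_964);
    (pool, result)
}

/// Sorted-batch size divided by the pool usage observed before the sort.
fn growth_ratio() -> f64 {
    698_840_964_f64 / 267_618_380_f64
}
```

   With the reported figures, the sketch leaves 234881024 bytes available, matching the error message, and the ratio comes out at roughly 2.61. The requested 698840964 bytes is larger than the whole 256MB pool, so this request could not succeed even if nothing else were reserved.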
   
   Not sure why the memory of the sorted batches is more than 2.6x that of the 
batches before the sort.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

