xuchen-plus commented on issue #12136: URL: https://github.com/apache/datafusion/issues/12136#issuecomment-2609230218
We have also encountered this issue. After some debugging (adding debug logs before every call to `try_grow`), it seems that the memory accounting for batches returned by `SortPreserveMerge` is wrong.

I set `sort_spill_reservation_bytes` to 8MB and `memory_limit` to 256MB. After inserting some batches, the pool fills up and the in-memory sort starts. At that moment the pool usage is: `GreedyMemoryPool { pool_size: 268435456, used: 267618380 }`. The sort and collect at https://github.com/apache/datafusion/blob/0228bee29aa97abc354b06f3278e709d752d92b3/datafusion/physical-plan/src/sorts/sort.rs#L440-L443 finished without error.

After the sort, all memory is released. Then the following code counts the memory of the sorted batches: https://github.com/apache/datafusion/blob/0228bee29aa97abc354b06f3278e709d752d92b3/datafusion/physical-plan/src/sorts/sort.rs#L445-L449

That size is 698840964, which exceeds the memory limit and results in:

```
Error: ResourcesExhausted("Additional allocation failed with top memory consumers (across reservations) as: ExternalSorterMerge[0] consumed 33554432 bytes, ExternalSorterMerge[18] consumed 0 bytes, ExternalSorter[7] consumed 0 bytes, ExternalSorter[14] consumed 0 bytes, ExternalSorter[1] consumed 0 bytes. Error: Failed to allocate additional 698840964 bytes for ExternalSorter[0] with 0 bytes already allocated for this reservation - 234881024 bytes remain available for the total pool")
```

Not sure why the sorted batches take over 2.6x the memory of the batches before the sort.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
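To make the arithmetic in the comment above easier to check, here is a minimal sketch that replays the reported figures against a toy greedy pool. `ToyGreedyPool`, `replay_reported_numbers`, and `growth_ratio` are hypothetical names invented for this illustration; this is not DataFusion's `GreedyMemoryPool`, and the amount still held after the sort is an assumption inferred from the "remain available" figure in the error message.

```rust
/// Toy stand-in for a greedy memory pool: a request succeeds only if it fits
/// in the bytes still available. Hypothetical type, not DataFusion's code.
#[derive(Debug)]
pub struct ToyGreedyPool {
    pub pool_size: usize,
    pub used: usize,
}

impl ToyGreedyPool {
    /// Bytes that can still be reserved.
    pub fn available(&self) -> usize {
        self.pool_size - self.used
    }

    /// Reserve `additional` bytes, or report how much was available.
    pub fn try_grow(&mut self, additional: usize) -> Result<(), String> {
        let available = self.available();
        if additional > available {
            return Err(format!(
                "failed to allocate additional {additional} bytes, {available} bytes remain available"
            ));
        }
        self.used += additional;
        Ok(())
    }

    /// Return `released` bytes to the pool.
    pub fn shrink(&mut self, released: usize) {
        self.used = self.used.saturating_sub(released);
    }
}

// Figures copied from the comment.
pub const POOL_SIZE: usize = 268_435_456; // memory_limit = 256MB
pub const USED_BEFORE_SORT: usize = 267_618_380; // pool usage when the sort started
pub const SORTED_SIZE: usize = 698_840_964; // size counted for the sorted batches
pub const REMAINING_AT_FAILURE: usize = 234_881_024; // "remain available" in the error

/// Replays the reported sequence against the toy pool and returns the result
/// of the final reservation attempt.
pub fn replay_reported_numbers() -> Result<(), String> {
    let mut pool = ToyGreedyPool { pool_size: POOL_SIZE, used: USED_BEFORE_SORT };
    // Assumption: the bytes still held after the sort are inferred from the
    // "remain available" figure in the error message.
    let still_held = POOL_SIZE - REMAINING_AT_FAILURE;
    pool.shrink(USED_BEFORE_SORT - still_held);
    // The sorted batches are then re-counted and reserved in one request.
    pool.try_grow(SORTED_SIZE)
}

/// Ratio of the post-sort size to the pre-sort pool usage.
pub fn growth_ratio() -> f64 {
    SORTED_SIZE as f64 / USED_BEFORE_SORT as f64
}
```

Calling `replay_reported_numbers()` returns an `Err`, because the re-counted size is larger than the entire pool, not just the remaining budget. `growth_ratio()` comes out slightly above 2.6, matching the figure quoted in the comment.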
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org