Re: [PR] bug: Fix memory reservation and allocation problems for SortExec [datafusion]

via GitHub Fri, 14 Feb 2025 03:21:54 -0800


Kontinuation commented on PR #14644:
URL: https://github.com/apache/datafusion/pull/14644#issuecomment-2659040926

> I can get the high-level idea of the more conservative memory accounting
in point 3, it is used to also account for merged batches, but I get lost in
the memory estimation details in the implementation (especially is this 2X
estimation amplification only for merged batches, or also intermediate `Row`s),
could you explain with a concrete example how everything is calculated? (for
example, there is a `ExternalSorter` with 100M memory budget, and it will
consume 10 batches, each with 20M size, how memory estimation is calculated in
each step)

The 2X amplification is mainly for intermediate `Row`s, not for merged
batches.

Let's assume that each batch is 10 MB, and we have 100 MB memory budget. The
following diagram shows how the memory consumption become 100MB when performing
merging.

![datafusion-sort-merge
drawio](https://github.com/user-attachments/assets/aeb3faad-1d27-4fb2-b475-55859b4883e9)

Here is the detailed explanation:

1. We reserve 2X memory for each batch on insertion, so when
`in_mem_batches` holds 5 batches and consumes 50 MB memory, we have already
reserved 100 MB memory. The next insertion will trigger a merge, and possibly a
spill.
2. When merge happens, each batch in `in_mem_batches` was sorted
individually using `sort_batch` and fed into `StreamingMergeBuilder` to build a
`SortPreservingMergeStream`. The batches were taken away from `in_mem_batches`,
and the original batches will be dropped immediately after retrieving a sorted
batch. We assume that the sorted batches has the same size as the original
batches.
3. `SortPreservingMergeStream` polls one batch from each sorted stream,
create a [row
representation](https://github.com/apache/datafusion/blob/45.0.0/datafusion/physical-plan/src/sorts/stream.rs#L118-L132)
or [sorted array
representation](https://github.com/apache/datafusion/blob/45.0.0/datafusion/physical-plan/src/sorts/stream.rs#L183-L188)
for each batch. The sorted batches were saved into `in_progress` and the
row/sorted array representation were saved into `cursors`. We assume that the
row representation or sorted array has the same size as the sorted batch. Now
we have consumed 100MB.
4. `SortPreservingMergeStream` produces merged batches. We can assume that
the overall memory consumption remains unchanged during this process, and
certainly we need to reserve memory for merged batches. Each time we poll a
merged batche from `SortPreservingMergeStream`, we try reserving memory for it.
If the reservation fails, all future merged batches polled from the merged
stream will be directly written to the spill file.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [PR] bug: Fix memory reservation and allocation problems for SortExec [datafusion]

Reply via email to