Rachelint commented on PR #11943:
URL: https://github.com/apache/datafusion/pull/11943#issuecomment-2287897129
@2010YOUY01 make sense, it seems `emit_early_if_necessary` function is
actually introduced in the spilling pr #7400.
I am checking the related codes about memory control, too.
But the `FirstBlocks` is just the blocked version `First`, because it is too
expansive to emit exact first n groups in blocked impls.
For example:
- `block size = 4`, `blocks=2`, and emit `first 3 groups`,
- After emitting, `groups in first block = 1`, `groups in second block = 4`
- And for keep the logic correct, we need to move groups to fill the first
blocks to make:
`groups in first block = 4`, `groups in second block = 1`
Actually we can remove the `FirstBlocks` and `AllBlocks` mode, and we impl
the related logics just out of `GroupValues` and `GroupAccumultor`.
For example, in `emit_early_if_necessary`, we check if the `GroupValues` and
`GroupAccumultor` are in blocked modes first. If so, we get the `block_size`,
do the aligned work, and finally pass the aligned results in `First(aligned)`
to them
But I think if we impl like this, it will be so confused, and for making it
clear, I introduce the two new blocked emission mode `FirstBlocks` and
`AllBlocks`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]