Leonardo Brás <leob...@redhat.com> wrote:
> On Mon, 2022-06-20 at 11:44 -0400, Peter Xu wrote:
>> On Mon, Jun 20, 2022 at 11:23:53AM +0200, Juan Quintela wrote:
>> > Once discussed this, what I asked in the past is that you are having too
>> > much dirty memory on zero_copy.  When you have a Multiterabyte guest, in
>> > a single round you have a "potentially" dirty memory on each channel of:
>> > 
>> >    total_amount_memory / number of channels.
>> > 
>> > In a Multiterabyte guest, this is more than likely going to be in the
>> > dozens of gigabytes.  As far as I know there is no card/driver that will
>> > benefit from so many pages in zero_copy, and the kernel will move to
>> > synchronous copy at some point.  (In older threads, Daniel showed how to
>> > test for this case).
>> 
>> I was wondering whether the kernel needs to cache a lot of messages for
>> zero copy if we don't flush it for a long time, as recvmsg(MSG_ERRQUEUE)
>> seems to be fetching one message from the kernel one at a time.  And,
>> whether that queue has a limit in length or something.
>
> IIRC, if all messages look the same, it 'merges' them in a single message,
> like, 'this range has these flags and output'.
>
> So, if no issue happens, we should have a single message with the confirmation
> of all sent buffers, meaning just a little memory is used for that.
>
>> 
>> Does it mean that once the kernel has cached enough of these messages
>> it'll fall back to the no-zero-copy mode?  And is that probably how the
>> kernel protects itself from using too much buffer for the error msgs?
>
> Since it merges the messages, I don't think it uses a lot of space for that.
>
> IIRC, the kernel will fall back to copying only if the network adapter /
> driver does not support MSG_ZEROCOPY, like when it does not support
> scatter-gather.

My understanding is that it will fall back when you have too much stuff
in flight.

>> 
>> This reminded me - Leo, have you considered also adding a patch to detect
>> the "fallback to non-zero-copy" condition?  With it, when the fallback
>> happens at some point (e.g. when the guest memory is larger than some
>> value), we'll know.
>
> I still did not consider that, but sure, how do you see that working?

send with zero_copy(1MB)
send with zero_copy(1MB)
.... (repeat)
at some point kernel decides:
sync all queue()
send next packet synchronously.

We are not wondering whether the kernel does this (it does).  What we are
wondering is when it does it, i.e. after 1MB worth of writes, 2MB, 10MB
...
That is the thing that depends on the kernel/network card/driver.
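
To make the "when" concrete, here is a rough sketch (not QEMU code) of
how one could estimate it from the completion notifications.  It assumes
every send carries the same number of bytes and that the send counter
starts at zero.  struct zc_note and bytes_before_first_copy() are
hypothetical names invented for this mail, and because the kernel can
merge notifications the result is only an estimate.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one completion notification: the inclusive
 * range [lo, hi] of send-call counters it covers, and whether the
 * kernel reported that the data in that range was copied. */
struct zc_note {
    uint32_t lo;
    uint32_t hi;
    int copied;
};

/* Estimate how many bytes were sent before the first send that the
 * kernel reported as copied, assuming every send carried chunk_bytes
 * and the send counter starts at zero.
 * Returns UINT64_MAX when no notification reported a copy. */
static uint64_t bytes_before_first_copy(const struct zc_note *notes,
                                        size_t n, uint64_t chunk_bytes)
{
    uint64_t first = UINT64_MAX;

    for (size_t i = 0; i < n; i++) {
        if (notes[i].copied && notes[i].lo < first) {
            first = notes[i].lo;
        }
    }
    return first == UINT64_MAX ? UINT64_MAX : first * chunk_bytes;
}
```

With 1MB sends, a first "copied" range starting at counter 10 would mean
roughly 10MB went out as real zero copy before the fallback.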


> We can't just disable zero-copy-send because the user actually opted in, so we
> could instead add a one time error message for when it falls back to copying,
> as it should happen in the first try of zero-copy send.

On your 1st (or 2nd) series, Dan Berrange explained how to use the
error message interface to detect it.

> Or we could fail the migration, stating the interface does not support
> MSG_ZEROCOPY, since it should happen in the first sendmsg().

> I would personally opt for the last option.
>
> What do you think?

Later, Juan.

