On Mon, 2022-06-20 at 11:44 -0400, Peter Xu wrote:
> On Mon, Jun 20, 2022 at 11:23:53AM +0200, Juan Quintela wrote:
> > Once discussed this, what I asked in the past is that you are having too
> > much dirty memory on zero_copy.  When you have a Multiterabyte guest, in
> > a single round you have a "potentially" dirty memory on each channel of:
> >
> >    total_amount_memory / number of channels.
> >
> > In a Multiterabyte guest, this is going to be more that probably in the
> > dozens of gigabytes.  As far as I know there is no card/driver that will
> > benefit for so many pages in zero_copy, and kernel will move to
> > synchronous copy at some point.  (In older threads, daniel showed how to
> > test for this case).
>
> I was wondering whether the kernel needs to cache a lot of messages for
> zero copy if we don't flush it for a long time, as recvmsg(MSG_ERRQUEUE)
> seems to be fetching one message from the kernel one at a time.  And,
> whether that queue has a limit in length or something.

IIRC, if all messages look the same, the kernel merges them into a single
message, along the lines of 'this range has these flags and output'.

So, if no issue happens, we should have a single message confirming all
sent buffers, meaning only a little memory is used for that.

>
> Does it mean that when the kernel could have cached enough of these
> messages then it'll fallback to the no-zero-copy mode?  And probably that's
> the way how kernel protects itself from using too much buffer for the error
> msgs?

Since it merges the messages, I don't think it uses a lot of space for
that.

IIRC, the kernel will fall back to copying only if the network adapter /
driver does not support MSG_ZEROCOPY, for example when it does not support
scatter-gather.

>
> This reminded me - Leo, have you considered adding the patch altogether to
> detect the "fallback to non-zero-copy" condition?  Because when with it and
> when the fallback happens at some point (e.g. when the guest memory is
> larger than some value) we'll know.

I have not considered that yet, but sure, how do you see that working?

We can't just disable zero-copy-send because the user actually opted in,
so we could instead add a one-time error message for when it falls back to
copying, as that should happen on the first try of zero-copy send.

Or we could fail the migration, stating that the interface does not support
MSG_ZEROCOPY, since the fallback should show up on the first sendmsg().

I would personally opt for the last option.

What do you think?

>
> Thanks,
>

Thanks Peter!

Best regards,
Leo