On Fri, Mar 21, 2025 at 11:33:31AM +0530, Manish wrote:
> Hi Daniel, Peter,
>
> Please let me know if this latest patch looks good?
>
>
> On 17/03/25 7:22 am, Manish Mishra wrote:
> > We allocate extra metadata SKBs in case of a zerocopy send. This metadata
> > memory is accounted for in the OPTMEM limit. If there is any error while
> > sending zerocopy packets or if zerocopy is skipped, these metadata SKBs are
> > queued in the socket error queue. This error queue is freed when userspace
> > reads it.
> >
> > Usually, if there are continuous failures, we merge the metadata into a
> > single SKB and free another one. As a result, it never exceeds the OPTMEM
> > limit. However, if there is any out-of-order processing or intermittent
> > zerocopy failures, this error chain can grow significantly, exhausting the
> > OPTMEM limit. As a result, all new sendmsg requests fail to allocate any
> > new SKB, leading to an ENOBUF error. Depending on the amount of data queued
> > before the flush (i.e., large live migration iterations), even large OPTMEM
> > limits are prone to failure.
> >
> > To work around this, if we encounter an ENOBUF error with a zerocopy
> > sendmsg, we flush the error queue and retry once more.
> >
> > V2:
> >   1. Removed the dirty_sync_missed_zero_copy migration stat.
> >   2. Made the call to qio_channel_socket_flush_internal() from
> >      qio_channel_socket_writev() non-blocking.
> >
> > V3:
> >   1. Add the dirty_sync_missed_zero_copy migration stat again.
> >
> > Signed-off-by: Manish Mishra <manish.mis...@nutanix.com>

I have an old comment which could still apply here:

https://lore.kernel.org/all/Z885hS6QmGOZYj7N@x1.local/

That's on s/zero_copy_flush_pending/zerocopy_flush_once/.  But no need to
repost only for that.. that's more or less a nitpick.

It's unfortunate we need to keep the ABI and the complexity even if the
counter almost means nothing solid..

The change overall looks good here.

Reviewed-by: Peter Xu <pet...@redhat.com>

Thanks,

-- 
Peter Xu