Re: [PATCH v5 3/6] QIOChannelSocket: Implement io_writev_zerocopy & io_flush_zerocopy for CONFIG_LINUX

Daniel P . Berrangé Tue, 23 Nov 2021 01:56:43 -0800

On Tue, Nov 23, 2021 at 01:46:44AM -0300, Leonardo Bras Soares Passos wrote:
> Hello Daniel,
> 
> On Fri, Nov 12, 2021 at 7:54 AM Daniel P. Berrangé <berra...@redhat.com> 
> wrote:
> [...]
> > > @@ -561,12 +577,15 @@ static ssize_t 
> > > qio_channel_socket_writev_flags(QIOChannel *ioc,
> > >   retry:
> > >      ret = sendmsg(sioc->fd, &msg, flags);
> > >      if (ret <= 0) {
> > > -        if (errno == EAGAIN) {
> > > +        switch (errno) {
> > > +        case EAGAIN:
> > >              return QIO_CHANNEL_ERR_BLOCK;
> > > -        }
> > > -        if (errno == EINTR) {
> > > +        case EINTR:
> > >              goto retry;
> > > +        case ENOBUFS:
> > > +            return QIO_CHANNEL_ERR_NOBUFS;
> >
> > Why does ENOBUFS need handling separately instead of letting
> > the error_setg_errno below handle it ?
> >
> > The caller immediately invokes error_setg_errno() again,
> > just with different error message.
> >
> > No code in this series ever looks at QIO_CHANNEL_ERR_NOBUFS
> > either, so we don't even need that special error return code
> > added AFAICT ?
> >
> 
> The idea was to add a custom message for ENOBUFS return when sending
> with MSG_ZEROCOPY.
> I mean, having this message is important for the user to understand
> why the migration is failing, but it would
> not make any sense to have this message while a non-zerocopy sendmsg()
> returns with ENOBUFS.
> 
> ENOBUFS : The output queue for a network interface was full.  This
> generally indicates that the interface has stopped sending, but may be
> caused by transient congestion.
> 
> As an alternative, I could add this message inside the switch, inside
> an if (flags & MSG_ZEROCOPY) on qio_channel_socket_writev_flags()
> instead of in it's caller.
> But for me it looks bloated, I mean, dealing with an error for
> ZEROCOPY only in the general function.


It is perfectly reasonable to check flags in this method.

> OTOH, if you think that it's a better idea to deal with every error in
> qio_channel_socket_writev_flags() instead of in the caller, I will
> change it for v6. Please let me know.

Yes, this method is already taking an ERror **errp parameter and
reporting a user facing error. If we need to report different
message text for ENOBUFS, it should be done in this method too.

The reason QIO_CHANNEL_ERR_BLOCK is special is because we are
explicitly not treating it as an error scenario at all.  That's
different to the ENOBUFS case.


> 
> > >          }
> > > +
> > >          error_setg_errno(errp, errno,
> > >                           "Unable to write to socket");
> > >          return -1;
> > > @@ -670,6 +689,127 @@ static ssize_t qio_channel_socket_writev(QIOChannel 
> > > *ioc,
> > >  }
> > >  #endif /* WIN32 */
> > >
> > > +
> > > +#ifdef CONFIG_LINUX
> > > +
> > > +static int qio_channel_socket_poll(QIOChannelSocket *sioc, bool zerocopy,
> > > +                                   Error **errp)
> >
> > There's only one caller and it always passes zerocopy=true,
> > so this parmeter looks pointless.
> 
> I did that for possible reuse of this function in the future:
> - As of today, this is certainly compiled out, but if at some point
> someone wants to use poll for something other
> than the reading of an zerocopy errqueue, it could be reused.
> 
> But sure, if that's not desirable, I can remove the parameter (and the
> if clause for !zerocopy).
> 
> >
> > > +{
> > > +    struct pollfd pfd;
> > > +    int ret;
> > > +
> > > +    pfd.fd = sioc->fd;
> > > +    pfd.events = 0;
> > > +
> > > + retry:
> > > +    ret = poll(&pfd, 1, -1);
> > > +    if (ret < 0) {
> > > +        switch (errno) {
> > > +        case EAGAIN:
> > > +        case EINTR:
> > > +            goto retry;
> > > +        default:
> > > +            error_setg_errno(errp, errno,
> > > +                             "Poll error");
> > > +            return ret;
> >
> >        return -1;
> >
> > > +        }
> > > +    }
> > > +
> > > +    if (pfd.revents & (POLLHUP | POLLNVAL)) {
> > > +        error_setg(errp, "Poll error: Invalid or disconnected fd");
> > > +        return -1;
> > > +    }
> > > +
> > > +    if (!zerocopy && (pfd.revents & POLLERR)) {
> > > +        error_setg(errp, "Poll error: Errors present in errqueue");
> > > +        return -1;
> > > +    }
> >
> > > +
> > > +    return ret;
> >
> >   return 0;
> 
> In the idea of future reuse I spoke above, returning zero here would
> make this function always look like the poll timed out. Some future
> users may want to repeat the waiting if poll() timed out, or if
> (return > 0) stop polling.

Now that I'm looking again, we should not really use poll() at all,
as GLib provides us higher level APIs. We in fact already have the
qio_channel_wait() method as a general purpose helper for waiting
for an I/O condition to occur.;


> I understand the idea of testing SO_EE_CODE_ZEROCOPY_COPIED to be able
> to tell whenever zerocopy fell back to copying for some reason, but I
> don't see how this can be helpful here.
> 
> Other than that I would do rv++ instead of rv=1 here, if I want to
> keep track of how many buffers were sent with zerocopy and how many
> ended up being copied.

Sure, we could do   "ret > 0 == number of buffers that were copied"
as the API contract, rather than just treating it as a boolean.



Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH v5 3/6] QIOChannelSocket: Implement io_writev_zerocopy & io_flush_zerocopy for CONFIG_LINUX

Reply via email to