On Mon, Apr 14, 2025 at 09:51:32AM +0800, Yong Huang wrote:
> On Fri, Apr 11, 2025 at 5:47 PM Daniel P. Berrangé <[email protected]>
> wrote:
>
> > On Tue, Apr 08, 2025 at 10:27:51AM +0800, [email protected] wrote:
> > > From: Hyman Huang <[email protected]>
> > >
> > > As advised by the GNU TLS, the caller should attempt again
> > > if the gnutls_record_{recv,send} return EAGAIN or EINTR;
> > > check the following link to view the details:
> > >
> > https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
> > >
> > > Add the retry parameter for virNetTLSSession{Read,Write}
> > > functions in accordance with this guideline.
> > >
> > > This prevents the upper application from encountering the
> > > following error message when it calls the virConnectOpenAuth API:
> > > Unable to read TLS confirmation: Resource temporarily unavailable
> > >
> > > Signed-off-by: Hyman Huang <[email protected]>
> > > ---
> > > src/rpc/virnetclient.c | 2 +-
> > > src/rpc/virnetsocket.c | 4 ++--
> > > src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++--
> > > src/rpc/virnettlscontext.h | 6 ++++--
> > > 4 files changed, 33 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
> > > index 92933220e2..5340db4211 100644
> > > --- a/src/rpc/virnetclient.c
> > > +++ b/src/rpc/virnetclient.c
> > > @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client,
> > > ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL));
> > > #endif /* !WIN32 */
> > >
> > > - len = virNetTLSSessionRead(client->tls, buf, 1);
> > > + len = virNetTLSSessionRead(client->tls, buf, 1, true);
> > > if (len < 0 && errno != ENOMSG) {
> > > virReportSystemError(errno, "%s",
> > > _("Unable to read TLS confirmation"));
> > > diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
> > > index e8fc2d5f7d..6774dd4a4b 100644
> > > --- a/src/rpc/virnetsocket.c
> > > +++ b/src/rpc/virnetsocket.c
> > > @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket
> > *sock, char *buf, size_t len)
> > > if (sock->tlsSession &&
> > > virNetTLSSessionGetHandshakeStatus(sock->tlsSession) ==
> > > VIR_NET_TLS_HANDSHAKE_COMPLETE) {
> > > - ret = virNetTLSSessionRead(sock->tlsSession, buf, len);
> > > + ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false);
> > > } else {
> > > ret = read(sock->fd, buf, len);
> > > }
> > > @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket
> > *sock, const char *buf, size_t
> > > if (sock->tlsSession &&
> > > virNetTLSSessionGetHandshakeStatus(sock->tlsSession) ==
> > > VIR_NET_TLS_HANDSHAKE_COMPLETE) {
> > > - ret = virNetTLSSessionWrite(sock->tlsSession, buf, len);
> > > + ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false);
> > > } else {
> > > ret = write(sock->fd, buf, len); /* sc_avoid_write */
> > > }
> > > diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c
> > > index e8023133b4..92d909c0b7 100644
> > > --- a/src/rpc/virnettlscontext.c
> > > +++ b/src/rpc/virnettlscontext.c
> > > @@ -679,16 +679,28 @@ void
> > virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess,
> > >
> > >
> > > ssize_t virNetTLSSessionWrite(virNetTLSSession *sess,
> > > - const char *buf, size_t len)
> > > + const char *buf, size_t len,
> > > + bool retry)
> > > {
> > > ssize_t ret;
> > >
> > > + rewrite:
> > > virObjectLock(sess);
> > > ret = gnutls_record_send(sess->session, buf, len);
> > >
> > > if (ret >= 0)
> > > goto cleanup;
> > >
> > > + if (retry && (ret == GNUTLS_E_AGAIN || ret ==
> > GNUTLS_E_INTERRUPTED)) {
> > > + /*
> > > + * GNU TLS advises calling the function again to obtain the
> > data if EAGAIN is returned.
> > > + * See reference:
> > https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html
> > > + * */
> >
> > To repeat my feedback on v1.
> >
> > This would surely be wrong for E_AGAIN - if the socket is blocking we must
> > return to the main loop and wait for it to indicate that the socket has
> > new data. If we don't wait in the main loop, this code will busy-loop on
>
> a non-blocking socket
>
>
> Agree, for the EAGAIN error, trying to repoll could be a solution like the
> following pieces of code?
>
> repoll:
> source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock),
> G_IO_IN,
> client->eventCtx,
> virNetClientIOEventTLSConfirm,
> client, NULL);
>
> g_main_loop_run(client->eventLoop);
>
> retry:
> len = virNetTLSSessionRead(client->tls, buf, 1, true);
> if (len < 0 && errno == EINTR) {
> goto retry;
> }
> if (len < 0 && errno == EAGAIN) {
> goto repoll;
> }
Yes, and virNetSocketReadWire is already handlnig both EINTR
and EAGAIN after it calls virNetTLSSessionRead.
The other caller is virNetClientSetTLSSession and I see that
is failed to handle EINTR/EGAIN, though EGAIN seems like it
ought to be unlikely given that caller waited for G_IO_IN...
> > A retry on E_INTERRUPTED is ok, but E_AGAIN must wait for the
> > socket to be readable. This is something higher level libvirt
> > code should already be doing correctly. Do you have any real
> > world example of a problem being hit ?
> >
snip
> Yes, when our higher management app try to connect Libvirtd using
> "/usr/local/venv/job-center/lib/python3.10/site-packages/libvirt.py", line
> 148, in openAuth raise libvirtError('virConnectOpenAuth() failed')
> libvirt.libvirtError: Unable to read TLS confirmation: Resource temporarily
> unavailable
....this error message comes from virNetClientSetTLSSession
and suggests the wait for G_IO_IN wasn't sufficient, so we
should retry I guess
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|