On 2/23/23 21:40, Eric Blake wrote:
> libnbd's copy/copy-nbd-error.sh was triggering an assertion failure in
> nbdkit:
> 
> $ nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M error-pread-rate=0.5 ] null:
> ...
> nbdkit: pattern.2: debug: error-inject: pread count=262144 offset=4718592
> nbdkit: pattern.2: debug: pattern: pread count=262144 offset=4718592
> nbdkit: pattern.1: debug: error-inject: pread count=262144 offset=4456448
> nbdkit: pattern.1: error: injecting EIO error into pread
> nbdkit: pattern.1: debug: sending error reply: Input/output error
> nbdkit: pattern.3: debug: error-inject: pread count=262144 offset=4980736
> nbdkit: pattern.3: debug: pattern: pread count=262144 offset=4980736
> nbdkit: pattern.4: error: write data: NBD_CMD_READ: Broken pipe
> nbdkit: pattern.4: debug: exiting worker thread pattern.4
> nbdkit: connections.c:402: raw_send_socket: Assertion `sock >= 0' failed.
> 
> When there are multiple queued requests, and the client hangs up
> abruptly as soon as the first error response is seen (rather than
> waiting until all other queued responses are flushed then sending
> NBD_CMD_DISC), another thread that has a queued response ready to send
> sees EPIPE (the client is no longer reading) and closes the socket,
> then the third thread triggers the assertion failure because it is not
> expecting to attempt a send to a closed socket.  Thus, my claim that
> sock would always be non-negative after collecting data from the
> plugin was wrong.  The fix is to gracefully handle an abrupt client
> disconnect, by not attempting to send on an already-closed socket.
> 
> For the nbdcopy example, --exit-with-parent means it's just an
> annoyance (nbdcopy has already exited, and no further client will be
> connecting); but for a longer-running nbdkit server accepting parallel
> clients, it means any one client can trigger the SIGABRT by
> intentionally queueing multiple NBD_CMD_READ then disconnecting early,
> and thereby kill nbdkit and starve other clients.  Whether it rises to
> the level of CVE depends on whether you consider one client being able
> to starve others a privilege escalation (if you are not using TLS,
> there are other ways for a bad client to starve peers; if you are
> using TLS, then the starved client has the same credentials as the
> client that caused the SIGABRT so there is no privilege boundary
> escalation).
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2173054
> Fixes: daef505e ("server: Give client EOF when we are done writing", v1.32.4)
> ---
>  server/connections.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/server/connections.c b/server/connections.c
> index 4d776f2a..1b63eb73 100644
> --- a/server/connections.c
> +++ b/server/connections.c
> @@ -1,5 +1,5 @@
>  /* nbdkit
> - * Copyright (C) 2013-2022 Red Hat Inc.
> + * Copyright (C) 2013-2023 Red Hat Inc.
>   *
>   * Redistribution and use in source and binary forms, with or without
>   * modification, are permitted provided that the following conditions are
> @@ -400,7 +400,10 @@ raw_send_socket (const void *vbuf, size_t len, int flags)
>    ssize_t r;
>    int f = 0;
> 
> -  assert (sock >= 0);
> +  if (sock < 0) {
> +    errno = EBADF;
> +    return -1;
> +  }
>  #ifdef MSG_MORE
>    if (flags & SEND_MORE)
>      f |= MSG_MORE;

Is it possible that, after the second thread sees EPIPE and closes the
socket, a new client connects, the original file descriptor is
(effectively) repurposed, and the third thread now attempts to send on a
valid, but *wrong* socket?

I don't know if "multiple clients for a single nbdkit server" is a
thing, but I believe that at least "multiconn" could lead to the same
phenomenon.
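
(Just to illustrate the kind of descriptor reuse I mean, here is a
trivial standalone demo, nothing nbdkit-specific: POSIX hands out the
lowest free descriptor number, so a close followed by a new open, or
by an accept on the listening socket, typically yields the same
number again.)

  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>

  int
  main (void)
  {
    int a = open ("/dev/null", O_WRONLY);
    printf ("first fd:  %d\n", a);
    close (a);

    /* POSIX returns the lowest unused descriptor number, so this
     * open (or an accept on a listening socket) usually gets back
     * the number that was just closed.  A thread still holding the
     * old number would now be talking to a different resource. */
    int b = open ("/dev/null", O_WRONLY);
    printf ("second fd: %d\n", b);

    close (b);
    return 0;
  }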

If this problem exists, then I guess the socket abstraction (?) would
have to be extended with a refcount, under the protection of a mutex.
Whenever sending fails (and note that EPIPE is a persistent error, so
once one thread sees EPIPE, the others will keep seeing that too!), the
refcount is decreased, and the one thread that reaches refcount 0 closes
the socket.
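
Roughly something like this is what I have in mind (only a sketch,
with made-up names; this is not nbdkit's actual structure or locking):

  #include <pthread.h>
  #include <unistd.h>

  /* Hypothetical per-connection wrapper, not nbdkit's real code. */
  struct refsock {
    pthread_mutex_t lock;
    int fd;              /* -1 once actually closed */
    unsigned refs;       /* one reference per worker thread */
  };

  /* A worker drops its reference when its send fails (EPIPE etc.)
   * or when it exits.  Only the last thread to reach zero closes
   * the descriptor, so the fd number cannot be handed out to a new
   * connection while another worker might still write to it. */
  static void
  refsock_put (struct refsock *s)
  {
    pthread_mutex_lock (&s->lock);
    if (s->refs > 0 && --s->refs == 0 && s->fd >= 0) {
      close (s->fd);
      s->fd = -1;
    }
    pthread_mutex_unlock (&s->lock);
  }

Workers that find fd == -1 would then just return an error, much like
the sock < 0 check in your patch, instead of touching the descriptor.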

This is just brainstorming, could be totally disconnected from nbdkit's
current behavior.

Laszlo