[Libguestfs] [PATCH libnbd v3] lib/errors.c: Fix assert fail in exit path in multi-threaded code

Richard W.M. Jones Thu, 09 Mar 2023 00:45:04 -0800

When a highly multi-threaded program such as nbdcopy encounters an
error, there is a race condition in the library which can cause an
assertion failure and thus a core dump:


(1) An error occurs on one of the threads.  nbdcopy calls exit(3).

(2) In lib/errors.c, the destructor calls pthread_key_delete.

(3) Another thread which is still running also encounters an error,
and inside libnbd the library calls set_error().

(4) set_error() calls pthread_getspecific, which returns NULL because
the key was already deleted in step (2).  We therefore call
pthread_setspecific, which fails with EINVAL because glibc detects
use of a pthread_key_t after it has been deleted.  The error message
is lost, and any subsequent call to nbd_get_error() returns NULL.

(5) We enter the %DEAD state, where there is an assertion that
nbd_get_error() is not NULL.  This assertion is supposed to be for
checking that the code called set_error() before entering the %DEAD
state.  Although we did call set_error(), the error message was lost
at step (4) and nbd_get_error() will always return NULL.  This
assertion failure causes a crash.
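
Steps (2) to (4) can be reproduced in a single thread.  The following
is only a minimal sketch, not libnbd code: probe_deleted_key and
struct probe_result are hypothetical names made up for illustration.
Using a key after pthread_key_delete is undefined by POSIX, so the
EINVAL result is the glibc behaviour described above and is not
portable.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper type, not part of libnbd. */
struct probe_result {
  void *value_after_delete;   /* result of pthread_getspecific */
  int set_rc;                 /* result of pthread_setspecific */
};

struct probe_result
probe_deleted_key (void)
{
  static int dummy_error;     /* stands in for the last_error struct */
  struct probe_result r = { NULL, -1 };
  pthread_key_t key;
  int rc;

  rc = pthread_key_create (&key, NULL);
  assert (rc == 0);
  (void) rc;
  pthread_setspecific (key, &dummy_error);

  /* Step (2): the exit-path destructor deletes the key. */
  pthread_key_delete (key);

  /* Steps (3)-(4): set_error() runs afterwards on a surviving thread.
   * This is undefined behaviour in POSIX; glibc happens to detect it.
   */
  r.value_after_delete = pthread_getspecific (key);
  r.set_rc = pthread_setspecific (key, &dummy_error);
  return r;
}
```

On glibc the returned value_after_delete should be NULL and set_rc
should be EINVAL, which is the state that later trips the assertion
in step (5).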

There aren't any good ways to fix this.  I chose to leak the
pthread_key_t on the exit path.
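
The effect of leaking the key can be sketched as follows.  This is an
illustration under assumptions, not the contents of lib/errors.c: all
demo_* names are hypothetical, and unlike the real destructor the
cleanup here also resets the slot to NULL so the sketch never leaves
a dangling pointer behind.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_key_t demo_key;   /* hypothetical stand-in for errors_key */

void
demo_init (void)
{
  /* free() is registered so exiting threads release their own copy. */
  int rc = pthread_key_create (&demo_key, free);
  assert (rc == 0);
  (void) rc;
}

/* Store a private copy of msg for the calling thread; 0 on success. */
int
demo_set_error (const char *msg)
{
  void *old = pthread_getspecific (demo_key);
  char *copy = malloc (strlen (msg) + 1);
  int rc;

  if (copy == NULL)
    return -1;
  strcpy (copy, msg);
  rc = pthread_setspecific (demo_key, copy);
  if (rc != 0)
    free (copy);              /* key unusable: the message is lost */
  else
    free (old);
  return rc;
}

const char *
demo_get_error (void)
{
  return pthread_getspecific (demo_key);
}

/* Exit-path cleanup: free this thread's value but keep the key alive. */
void
demo_exit_cleanup (void)
{
  free (pthread_getspecific (demo_key));
  pthread_setspecific (demo_key, NULL);
  /* Deliberately no pthread_key_delete (demo_key). */
}
```

Because the key stays valid after demo_exit_cleanup, a later
demo_set_error on any surviving thread still succeeds and
demo_get_error returns the stored message.  The cost is one
pthread_key_t that is never released before the process exits.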
---
 lib/errors.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/errors.c b/lib/errors.c
index 8b77650ef3..6fbfaacd34 100644
--- a/lib/errors.c
+++ b/lib/errors.c
@@ -69,7 +69,11 @@ errors_key_destroy (void)
     free (last_error->error);
     free (last_error);
   }
-  pthread_key_delete (errors_key);
+
+  /* We could do this, but that causes a race condition described here:
+   * https://listman.redhat.com/archives/libguestfs/2023-March/031002.html
+   */
+  //pthread_key_delete (errors_key);
 }
 
 /* This is called when a thread exits, to free the thread-local data
-- 
2.39.2

_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs