On Fri, Oct 14, 2022 at 06:04:06PM +0100, Andrew Stubbs wrote:
> This patch fixes a problem in which fatal errors inside mutex-locked regions
> (i.e. basically anything in the plugin) will cause it to hang up trying to
> take the lock to clean everything up.
> 
> Using abort() instead of exit(1) bypasses the atexit handlers and solves the
> problem.
> 
> OK for mainline?
> 
> Andrew

> libgomp: fix hang on fatal error
> 
> Don't try to clean up if a fatal error occurs in libgomp.  Typically the
> cleanup is not reentrant so we end up hung on a lock.
> 
> libgomp/ChangeLog:
> 
>       * error.c (gomp_vfatal): Use abort instead of exit.
> 
> diff --git a/libgomp/error.c b/libgomp/error.c
> index 50ed85eedb1..25548c14a82 100644
> --- a/libgomp/error.c
> +++ b/libgomp/error.c
> @@ -77,7 +77,7 @@ void
>  gomp_vfatal (const char *fmt, va_list list)
>  {
>    gomp_verror (fmt, list);
> -  exit (EXIT_FAILURE);
> +  abort ();
>  }
>  
>  void

I don't like this; abort has quite different user-visible behavior
from exit, e.g. the former often dumps core.
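
To make that difference concrete, here is a minimal POSIX sketch, not
libgomp code. It runs exit, _exit and abort in a forked child and records
whether an atexit handler ran and how the child terminated. All names
(run_child, term_result, cleanup_handler) are hypothetical, and it assumes
fork, pipe, waitpid and setrlimit are available.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; this is an illustration, not libgomp.  */
enum term_mode { TERM_EXIT, TERM_UNDERSCORE_EXIT, TERM_ABORT };

struct term_result
{
  int exited;       /* nonzero if the child ended via exit or _exit */
  int exit_code;    /* meaningful only when exited is nonzero */
  int term_signal;  /* terminating signal number, or 0 */
  int handler_ran;  /* nonzero if the atexit handler ran */
};

static int handler_fd = -1;

/* Stand-in for library cleanup: tell the parent that we ran.  */
static void
cleanup_handler (void)
{
  if (write (handler_fd, "x", 1) < 0)
    _exit (2);
}

static struct term_result
run_child (enum term_mode mode)
{
  struct term_result r = { 0, 0, 0, 0 };
  int fds[2];

  if (pipe (fds) != 0)
    return r;
  fflush (NULL);  /* avoid duplicating buffered stdio output in the child */

  pid_t pid = fork ();
  if (pid < 0)
    {
      close (fds[0]);
      close (fds[1]);
      return r;
    }
  if (pid == 0)
    {
      /* Child: suppress core files so the demo leaves nothing behind.  */
      struct rlimit no_core = { 0, 0 };
      setrlimit (RLIMIT_CORE, &no_core);
      close (fds[0]);
      handler_fd = fds[1];
      atexit (cleanup_handler);
      switch (mode)
        {
        case TERM_EXIT:
          exit (EXIT_FAILURE);   /* runs atexit handlers */
        case TERM_UNDERSCORE_EXIT:
          _exit (EXIT_FAILURE);  /* skips atexit handlers */
        case TERM_ABORT:
          abort ();              /* raises SIGABRT, skips atexit handlers */
        }
      _exit (3);                 /* not reached */
    }

  /* Parent: read returns 1 if the handler wrote, 0 at end of file.  */
  close (fds[1]);
  char c;
  r.handler_ran = (read (fds[0], &c, 1) == 1);
  close (fds[0]);

  int status = 0;
  if (waitpid (pid, &status, 0) != pid)
    return r;
  if (WIFEXITED (status))
    {
      r.exited = 1;
      r.exit_code = WEXITSTATUS (status);
    }
  else if (WIFSIGNALED (status))
    r.term_signal = WTERMSIG (status);
  return r;
}
```

A caller such as a shell or test harness sees an ordinary exit status for
exit and _exit, but a death by SIGABRT for abort. Whether a core file is
written depends on the system's core limits, which this sketch sets to zero
in the child.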

I believe in most places libgomp handles this by releasing locks before
calling gomp_{,v}fatal:
      gomp_mutex_unlock (&register_lock);
      gomp_fatal ("Out of memory allocating %lu bytes", (unsigned long) size);

      gomp_mutex_unlock (&devicep->lock);
      gomp_fatal ("Copying of %s object [%p..%p) to %s object [%p..%p) failed",
                  src, srcaddr, srcaddr + size, dst, dstaddr, dstaddr + size);

etc.
I could live with a gomp_fatal/gomp_vfatal alternative that uses
_exit/_Exit (though I'm not sure it is supported on all targets that
libgomp supports) for uses where releasing locks is, for whatever reason,
not an option.
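
As a rough sketch of the two options, using hypothetical stand-ins rather
than libgomp's own functions: demo_fatal reports and calls exit, so the
caller must drop the lock first, while demo_fatal_skip_handlers reports and
calls _Exit, so atexit cleanup never runs and cannot block on a lock that
is still held. The mutex, the cleanup handler and the demo_run driver are
assumptions made for illustration, and the sketch assumes POSIX fork and
pthread mutexes.

```c
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins; none of these names are libgomp's.  */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup registered with atexit.  It takes the lock, as teardown
   code might, so it would block if the lock were still held.  */
static void
demo_cleanup (void)
{
  pthread_mutex_lock (&demo_lock);
  pthread_mutex_unlock (&demo_lock);
}

/* Report and exit normally.  atexit handlers run, so callers must
   release demo_lock before calling this.  */
static void
demo_fatal (const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  vfprintf (stderr, fmt, ap);
  va_end (ap);
  fputc ('\n', stderr);
  exit (EXIT_FAILURE);
}

/* Report and terminate without running atexit handlers.  Usable
   while demo_lock is held, at the cost of skipping all cleanup.  */
static void
demo_fatal_skip_handlers (const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  vfprintf (stderr, fmt, ap);
  va_end (ap);
  fputc ('\n', stderr);
  fflush (stderr);  /* _Exit is not required to flush stdio buffers */
  _Exit (EXIT_FAILURE);
}

/* Run one failure path in a child process and return its exit code,
   or -1 if it did not exit normally.  */
static int
demo_run (int unlock_first)
{
  fflush (NULL);
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      alarm (5);  /* safety net: kill the child if it ever deadlocks */
      atexit (demo_cleanup);
      pthread_mutex_lock (&demo_lock);
      if (unlock_first)
        {
          pthread_mutex_unlock (&demo_lock);
          demo_fatal ("fatal error after releasing the lock");
        }
      else
        demo_fatal_skip_handlers ("fatal error with the lock still held");
      _Exit (2);  /* not reached */
    }

  int status = 0;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Both demo_run (1) and demo_run (0) should report a normal exit with
EXIT_FAILURE. The difference is that only the first path runs the cleanup
handler.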

        Jakub
