On Tue, Dec 01, 2015 at 08:29:27PM +0300, Ilya Verbin wrote:
> libgomp/
>       * target.c (finalized): New static variable.
>       (resolve_device): Do nothing when finalized is true.
>       (GOMP_offload_register_ver): Likewise.
>       (GOMP_offload_unregister_ver): Likewise.
>       (gomp_target_fini): New static function.
>       (gomp_target_init): Call gomp_target_fini at exit.
> liboffloadmic/
>       * plugin/libgomp-plugin-intelmic.cpp (unregister_main_image): Remove.
>       (register_main_image): Do not call unregister_main_image at exit.
>       (GOMP_OFFLOAD_fini_device): Allow for OpenMP.  Unregister main image.
> 
> diff --git a/libgomp/target.c b/libgomp/target.c
> index cf9d0e6..320178e 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -78,6 +78,10 @@ static int num_devices;
>  /* Number of GOMP_OFFLOAD_CAP_OPENMP_400 devices.  */
>  static int num_devices_openmp;
>  
> +/* True when offloading runtime is finalized.  */
> +static bool finalized;


> +
> +
>  /* Similar to gomp_realloc, but release register_lock before gomp_fatal.  */
>  
>  static void *
> @@ -108,6 +112,9 @@ gomp_get_num_devices (void)
>  static struct gomp_device_descr *
>  resolve_device (int device_id)
>  {
> +  if (finalized)
> +    return NULL;
> +

This is racy, tsan would tell you so.
Instead of a global var, I'd just change the devicep->is_initialized 
field from bool into a 3 state field (perhaps enum), with states
uninitialized, initialized, finalized, and then say in resolve_device,

  gomp_mutex_lock (&devices[device_id].lock);
  if (devices[device_id].state == GOMP_DEVICE_UNINITIALIZED)
    gomp_init_device (&devices[device_id]);
  else if (devices[device_id].state == GOMP_DEVICE_FINALIZED)
    {
      gomp_mutex_unlock (&devices[device_id].lock);
      return NULL;
    }
  gomp_mutex_unlock (&devices[device_id].lock);

Though, of course, that is incomplete, because resolve_device takes one
lock, gomp_get_target_fn_addr another one, gomp_map_vars yet another one.
So I think either we want to rewrite the locking, such that say
resolve_device returns a locked device and then you perform stuff on the
locked device (disadvantage is that gomp_map_vars will call gomp_malloc
with the lock held, which can take some time to allocate the memory),
or there needs to be the possibility that gomp_map_vars rechecks if the
device has not been finalized after taking the lock and returns to the
caller if the device has been finalized in between resolve_device and
gomp_map_vars.

        Jakub

Reply via email to