On Tue, Dec 01, 2015 at 08:29:27PM +0300, Ilya Verbin wrote: > libgomp/ > * target.c (finalized): New static variable. > (resolve_device): Do nothing when finalized is true. > (GOMP_offload_register_ver): Likewise. > (GOMP_offload_unregister_ver): Likewise. > (gomp_target_fini): New static function. > (gomp_target_init): Call gomp_target_fini at exit. > liboffloadmic/ > * plugin/libgomp-plugin-intelmic.cpp (unregister_main_image): Remove. > (register_main_image): Do not call unregister_main_image at exit. > (GOMP_OFFLOAD_fini_device): Allow for OpenMP. Unregister main image. > > diff --git a/libgomp/target.c b/libgomp/target.c > index cf9d0e6..320178e 100644 > --- a/libgomp/target.c > +++ b/libgomp/target.c > @@ -78,6 +78,10 @@ static int num_devices; > /* Number of GOMP_OFFLOAD_CAP_OPENMP_400 devices. */ > static int num_devices_openmp; > > +/* True when offloading runtime is finalized. */ > +static bool finalized;
> + > + > /* Similar to gomp_realloc, but release register_lock before gomp_fatal. */ > > static void * > @@ -108,6 +112,9 @@ gomp_get_num_devices (void) > static struct gomp_device_descr * > resolve_device (int device_id) > { > + if (finalized) > + return NULL; > + This is racy, tsan would tell you so. Instead of a global var, I'd just change the devicep->is_initialized field from bool into a 3 state field (perhaps enum), with states uninitialized, initialized, finalized, and then say in resolve_device, gomp_mutex_lock (&devices[device_id].lock); if (devices[device_id].state == GOMP_DEVICE_UNINITIALIZED) gomp_init_device (&devices[device_id]); else if (devices[device_id].state == GOMP_DEVICE_FINALIZED) { gomp_mutex_unlock (&devices[device_id].lock); return NULL; } gomp_mutex_unlock (&devices[device_id].lock); Though, of course, that is incomplete, because resolve_device takes one lock, gomp_get_target_fn_addr another one, gomp_map_vars yet another one. So I think either we want to rewrite the locking, such that say resolve_device returns a locked device and then you perform stuff on the locked device (disadvantage is that gomp_map_vars will call gomp_malloc with the lock held, which can take some time to allocate the memory), or there needs to be the possibility that gomp_map_vars rechecks if the device has not been finalized after taking the lock and returns to the caller if the device has been finalized in between resolve_device and gomp_map_vars. Jakub