On 07/26/2018 04:27 PM, Cesar Philippidis wrote:
> Hi Tom,
>
> I see that you're reviewing the libgomp changes. Please disregard the
> following hunk:
>
> On 07/11/2018 12:13 PM, Cesar Philippidis wrote:
>> @@ -1199,12 +1202,59 @@ nvptx_exec (void (*fn), size_t mapnum, void
>> **hostaddrs, void **devaddrs,
>>                        default_dims[GOMP_DIM_VECTOR]);
>>      }
>>    pthread_mutex_unlock (&ptx_dev_lock);
>> +  int vectors = default_dims[GOMP_DIM_VECTOR];
>> +  int workers = default_dims[GOMP_DIM_WORKER];
>> +  int gangs = default_dims[GOMP_DIM_GANG];
>> +
>> +  if (nvptx_thread()->ptx_dev->driver_version > 6050)
>> +    {
>> +      int grids, blocks;
>> +      CUDA_CALL_ASSERT (cuOccupancyMaxPotentialBlockSize, &grids,
>> +                        &blocks, function, NULL, 0,
>> +                        dims[GOMP_DIM_WORKER] * dims[GOMP_DIM_VECTOR]);
>> +      GOMP_PLUGIN_debug (0, "cuOccupancyMaxPotentialBlockSize: "
>> +                         "grid = %d, block = %d\n", grids, blocks);
>> +
>> +      gangs = grids * dev_size;
>> +      workers = blocks / vectors;
>> +    }
>
> I revisited this change yesterday and I noticed it was setting gangs
> incorrectly. Basically, gangs should be set as follows
>
>   gangs = grids * (blocks / warp_size);
>
> or to be more closer to og8 as
>
>   gangs = 2 * grids * (blocks / warp_size);
>
> The use of that magic constant 2 is to prevent thread starvation. That's
> a similar concept behind make -j<2*#threads>.
>
> Anyway, I'm still experimenting with that change. There are still some
> discrepancies between the way that I select num_workers and how the
> driver does. The driver appears to be a little bit more conservative,
> but according to the thread occupancy calculator, that should yield
> greater performance on GPUs.
>
> I just wanted to give you a heads up because you seem to be working on this.
>
Ack, thanks for letting me know.

> Thanks for all of your reviews!
>
> By the way, are you now maintainer of the libgomp nvptx plugin?

I'm not sure if that's a separate thing.

AFAIU the responsibilities of the nvptx maintainer are:
- the nvptx backend (under supervision of the global maintainers)
- and anything nvptx-y in all other components (under supervision of the
  component and global maintainers)

So, I'd say I'm on the hook to review patches for the nvptx plugin in
libgomp.

Thanks,
- Tom
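
For reference, the two gang formulas quoted above can be sketched as a
small standalone helper. This is only an illustration under stated
assumptions, not the code from the patch: struct launch_dims,
suggest_dims and the oversubscribe parameter are hypothetical names, and
grids and blocks stand in for the minimum grid size and block size that
cuOccupancyMaxPotentialBlockSize reports.

```c
#include <assert.h>

/* Hypothetical container for the launch geometry; not a libgomp type.  */
struct launch_dims
{
  int gangs;
  int workers;
  int vectors;
};

/* Derive gangs and workers from occupancy results.

   GRIDS and BLOCKS are assumed to be the minimum grid size and block
   size reported by cuOccupancyMaxPotentialBlockSize.  OVERSUBSCRIBE is
   the "magic constant": 1 gives the first formula from the mail, 2 the
   og8-like variant.  */
static struct launch_dims
suggest_dims (int grids, int blocks, int vectors, int warp_size,
              int oversubscribe)
{
  assert (vectors > 0 && warp_size > 0 && oversubscribe > 0);

  struct launch_dims d;
  d.vectors = vectors;
  /* Same worker calculation as in the quoted hunk.  */
  d.workers = blocks / vectors;
  /* Corrected gang calculation: scale by warps per block instead of
     by dev_size.  */
  d.gangs = oversubscribe * grids * (blocks / warp_size);
  return d;
}
```

With grids = 40, blocks = 1024 and vectors = warp_size = 32, this gives
32 workers and 1280 gangs, or 2560 gangs with the factor of 2.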