Re: [patch] adjust default nvptx launch geometry for OpenACC offloaded regions

Cesar Philippidis Thu, 21 Jun 2018 06:59:14 -0700

On 06/20/2018 03:15 PM, Tom de Vries wrote:
> On 06/20/2018 11:59 PM, Cesar Philippidis wrote:
>> Now it follows the formula contained in
>> the "CUDA Occupancy Calculator" spreadsheet that's distributed with CUDA.
> 
> Any reason we're not using the cuda runtime functions to get the
> occupancy (see PR85590 - [nvptx, libgomp, openacc] Use cuda runtime fns
> to determine launch configuration in nvptx ) ?


There are two reasons:

  1) cuda_occupancy.h depends on the CUDA runtime to extract the device
     properties instead of the CUDA driver API. However, we can always
     teach libgomp how to populate the cudaDeviceProp struct using the
     driver API.

  2) CUDA is not always present on the build host, and that's why
     libgomp maintains its own cuda.h. So at the very least, this
     functionality would be good to have in libgomp as a fallback
     implementation; it's not good to have programs fail due to
     insufficient hardware resource errors when that is avoidable.
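
As a rough illustration of the kind of fallback calculation meant in
point 2, here is a minimal sketch in C. It is not the formula from the
patch or from the spreadsheet: it only considers thread and register
limits and ignores shared memory and allocation granularity. The names
sm_limits and blocks_per_sm and the limit values used with them are
hypothetical; in libgomp the limits could presumably be queried through
the driver API with cuDeviceGetAttribute.

```c
#include <assert.h>

/* Hypothetical per-multiprocessor limits.  In a real implementation these
   would be queried from the device (e.g. via cuDeviceGetAttribute) rather
   than hard-coded.  */
struct sm_limits
{
  int max_threads;  /* Resident threads per multiprocessor.  */
  int max_blocks;   /* Resident blocks per multiprocessor.  */
  int max_regs;     /* 32-bit registers per multiprocessor.  */
  int warp_size;    /* Threads per warp.  */
};

static int
min_int (int a, int b)
{
  return a < b ? a : b;
}

/* Return how many blocks of THREADS_PER_BLOCK threads, each thread using
   REGS_PER_THREAD registers, can be resident on one multiprocessor.
   A result of 0 means the launch geometry cannot fit at all.
   Simplified: shared memory and register allocation granularity are
   ignored.  */
static int
blocks_per_sm (const struct sm_limits *sm, int threads_per_block,
               int regs_per_thread)
{
  assert (sm->warp_size > 0);
  if (threads_per_block <= 0)
    return 0;

  /* Threads are scheduled in whole warps, so round up.  */
  int warps = (threads_per_block + sm->warp_size - 1) / sm->warp_size;
  int threads = warps * sm->warp_size;

  int by_threads = sm->max_threads / threads;
  int by_regs = (regs_per_thread > 0
                 ? sm->max_regs / (regs_per_thread * threads)
                 : sm->max_blocks);

  return min_int (sm->max_blocks, min_int (by_threads, by_regs));
}
```

With assumed limits of 2048 threads, 32 blocks and 65536 registers per
multiprocessor, a 1024-thread block using 128 registers per thread
yields 0, which is the case where a launch would fail for lack of
resources and a smaller default geometry would have to be chosen.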

Cesar
