Hi, CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR is defined in cuda driver api version 6.0 and higher.
Currently nvptx_open_device uses a hard-coded constant instead. This patch fixes that by: - defining CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR to the hardcoded constant at toplevel, if not present in cuda.h, and - using CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR in nvptx_open_device Build on x86_64 with nvptx accelerator and reg-tested libgomp. Committed to trunk. Thanks, - Tom [libgomp, nvptx] Remove hard-coded const in nvptx_open_device 2018-08-08 Tom de Vries <tdevr...@suse.de> * plugin/plugin-nvptx.c (CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR): Define. (nvptx_open_device): Use CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR. --- libgomp/plugin/plugin-nvptx.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index e2ea542049b0..589d6596cc2f 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -51,6 +51,7 @@ #if CUDA_VERSION < 6000 extern CUresult cuGetErrorString (CUresult, const char **); +#define CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR 82 #endif #define DO_PRAGMA(x) _Pragma (#x) @@ -741,9 +742,11 @@ nvptx_open_device (int n) &pi, CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_BLOCK, dev); ptx_dev->regs_per_block = pi; - /* CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR = 82 is defined only + /* CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR is defined only in CUDA 6.0 and newer. */ - r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi, 82, dev); + r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi, + CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR, + dev); /* Fallback: use limit of registers per block, which is usually equal. */ if (r == CUDA_ERROR_INVALID_VALUE) pi = ptx_dev->regs_per_block;