Hi,

I have successfully built openmpi-3.0.0 from source with CUDA 8.0.61.2 and
7.5.18 on CentOS-7 x86_64 (default system GNU compilers).
Building the same openmpi-3.0.0 with CUDA 9 on CentOS-7 fails
with this error:

make[2]: Leaving directory `/c7/home/tru/build/openmpi-3.0.0/build-cuda-9.0.176_384.81/opal/mca/shmem/sysv'
Making all in tools/wrappers
make[2]: Entering directory `/c7/home/tru/build/openmpi-3.0.0/build-cuda-9.0.176_384.81/opal/tools/wrappers'
  CCLD     opal_wrapper
../../../opal/.libs/libopen-pal.so: undefined reference to `nvmlDeviceGetPciInfo_v3'
collect2: error: ld returned 1 exit status
make[2]: *** [opal_wrapper] Error 1
make[2]: Leaving directory `/c7/home/tru/build/openmpi-3.0.0/build-cuda-9.0.176_384.81/opal/tools/wrappers'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/c7/home/tru/build/openmpi-3.0.0/build-cuda-9.0.176_384.81/opal'
make: *** [all-recursive] Error 1

<Additionnal informations (failing builder)>
[tru@manolito build-cuda-9.0.176_384.81]$ grep -r nvmlDeviceGetPciInfo_v3 $CUDA_INSTALL_PATH
Binary file /c7/shared/cuda/9.0.176_384.81/lib64/stubs/libnvidia-ml.so matches
/c7/shared/cuda/9.0.176_384.81/include/nvml.h:#define nvmlDeviceGetPciInfo        nvmlDeviceGetPciInfo_v3

The desktop has a legacy card, and its driver does not support CUDA 9.
I would not expect that to cause a link error at build time, but maybe it does?
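
One way to test that suspicion is to check which libnvidia-ml.so files
actually export the missing symbol. The grep output above shows that nvml.h
from CUDA 9 maps nvmlDeviceGetPciInfo to nvmlDeviceGetPciInfo_v3; a possible
explanation (not confirmed here) is that the libnvidia-ml.so shipped with the
340.102 driver predates that symbol. The sketch below is only an illustration:
the helper names listing_defines and check_lib are hypothetical, and it
assumes binutils nm and awk are installed.

```shell
#!/bin/sh
# Sketch: report whether a shared library exports a given symbol.
# Assumptions (not from the original report): binutils `nm` and `awk`
# are available; the function names are made up for this example.

# Succeeds if the `nm -D` style listing on stdin DEFINES symbol $1.
# Lines whose type column is "U" are undefined references and are ignored.
listing_defines() {
    awk -v sym="$1" '
        { name = $NF; sub(/@.*/, "", name) }   # drop any @VERSION suffix
        NF >= 2 && $(NF-1) != "U" && name == sym { found = 1 }
        END { exit !found }'
}

# $1 = path to a shared library, $2 = symbol name.
check_lib() {
    if [ ! -r "$1" ]; then
        echo "$1: not readable"
        return 2
    fi
    if nm -D "$1" 2>/dev/null | listing_defines "$2"; then
        echo "$1: exports $2"
    else
        echo "$1: does NOT export $2"
        return 1
    fi
}
```

After sourcing these functions, it could be run against the CUDA 9 stub and
against the driver's own library. The second path is an assumption and will
differ between installations:

  check_lib "$CUDA_INSTALL_PATH/lib64/stubs/libnvidia-ml.so" nvmlDeviceGetPciInfo_v3
  check_lib /usr/lib64/libnvidia-ml.so nvmlDeviceGetPciInfo_v3

If the stub exports the symbol and the driver library does not, the link step
was probably resolving against the driver library.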

[tru@manolito build-cuda-9.0.176_384.81]$ nvidia-smi
Wed Oct 11 08:42:33 2017
+------------------------------------------------------+
| NVIDIA-SMI 340.102    Driver Version: 340.102        |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 8600 GT     Off  | 0000:01:00.0     N/A |                  N/A |
|  0%   72C    P0    N/A /  N/A |      3MiB /   511MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

[tru@manolito build-cuda-9.0.176_384.81]$ deviceQuery
deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
[tru@manolito build-cuda-9.0.176_384.81]$ deviceQueryDrv 
deviceQueryDrv Starting...

CUDA Device Query (Driver API) statically linked version 
Detected 1 CUDA Capable device(s)

Device 0: "GeForce 8600 GT"
  CUDA Driver Version:                           6.5
  CUDA Capability Major/Minor version number:    1.1
  Total amount of global memory:                 511 MBytes (536150016 bytes)
MapSMtoCores for SM 1.1 is undefined.  Default to use 64 Cores/SM
MapSMtoCores for SM 1.1 is undefined.  Default to use 64 Cores/SM
  ( 4) Multiprocessors, ( 64) CUDA Cores/MP:     256 CUDA Cores
  GPU Max Clock rate:                            1188 MHz (1.19 GHz)
  Memory Clock rate:                             700 Mhz
  Memory Bus Width:                              128-bit
  Max Texture Dimension Sizes                    1D=(8192) 2D=(65536, 32768) 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(8192), 512 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(8192, 8192), 512 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  768
  Maximum number of threads per block:           512
  Max dimension size of a thread block (x,y,z): (512, 512, 64)
  Max dimension size of a grid size (x,y,z):    (65535, 65535, 1)
  Texture alignment:                             256 bytes
  Maximum memory pitch:                          2147483647 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   No
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      No
cuDeviceGetAttribute returned 1
-> CUDA_ERROR_INVALID_VALUE

The NVIDIA driver (340.102) only supports CUDA 6.5, yet building against
CUDA 7.5 and 8.0 on this machine works without any issue.
</Additionnal informations (failing builder)>

If I switch to a newer machine (same OS, just a different card and NVIDIA driver),
the build goes through and the checks pass!

Bottom line: for CUDA 9 (only?), one might need to build on the target machine
rather than on a legacy one. Of course, YMMV.

Cheers

Tru

<Additionnal info (successfull builder)>
[tru@borma build-cuda-9.0.176_384.81]$ deviceQueryDrv 
deviceQueryDrv Starting...

CUDA Device Query (Driver API) statically linked version 
Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080 Ti"
  CUDA Driver Version:                           9.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 11172 MBytes (11714691072 bytes)
  (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
  GPU Max Clock rate:                            1582 MHz (1.58 GHz)
  Memory Clock rate:                             5505 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 2883584 bytes
  Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535)
  Texture alignment:                             512 bytes
  Maximum memory pitch:                          2147483647 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 6 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Result = PASS

</Additionnal info (successfull builder)>

-- 
Dr Tru Huynh | mailto:t...@pasteur.fr | tel/fax +33 1 45 68 87 37/19
https://research.pasteur.fr/en/team/structural-bioinformatics/
Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France  