FWIW, I am still open to implementing something in hwloc to work around this. It could be a shell variable such as HWLOC_DISABLE_NVML=yes, with one such variable for each of our major configured dependencies.
Brice

On 24/10/2016 02:12, Gilles Gouaillardet wrote:
> Justin,
>
> iirc, NVML is only used by hwloc (e.g. not by CUDA) and there is no
> real benefit for having that.
>
> as a workaround, you can
>
> export enable_nvml=no
>
> and then configure && make install
>
> Cheers,
>
> Gilles
>
> On 10/20/2016 12:49 AM, Jeff Squyres (jsquyres) wrote:
>> Justin --
>>
>> Fair point. Can you work with Sylvain Jeaugey (at Nvidia) to submit
>> a pull request for this functionality?
>>
>> Thanks.
>>
>>> On Oct 18, 2016, at 2:26 PM, Justin Luitjens <jluitj...@nvidia.com>
>>> wrote:
>>>
>>> After looking into this a bit more it appears that the issue is I am
>>> building on a head node which does not have the driver installed.
>>> Building on back node resolves this issue. In CUDA 8.0 the NVML
>>> stubs can be found in the toolkit at the following path:
>>>
>>> ${CUDA_HOME}/lib64/stubs
>>>
>>> For 8.0 I’d suggest updating the configure/make scripts to look
>>> for nvml there and link in the stubs. This way the build is not
>>> dependent on the driver being installed and only the toolkit.
>>>
>>> Thanks,
>>> Justin
>>>
>>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
>>> Justin Luitjens
>>> Sent: Tuesday, October 18, 2016 9:53 AM
>>> To: users@lists.open-mpi.org
>>> Subject: [OMPI users] Problem building OpenMPI with CUDA 8.0
>>>
>>> I have the release version of CUDA 8.0 installed and am trying to
>>> build OpenMPI.
>>>
>>> Here is my configure and build line:
>>>
>>> ./configure --prefix=$PREFIXPATH --with-cuda=$CUDA_HOME --with-tm=
>>> --with-openib= && make && sudo make install
>>>
>>> Where CUDA_HOME points to the cuda install path.
>>> When I run the above command it builds for quite a while but
>>> eventually errors out with this:
>>>
>>> make[2]: Entering directory
>>> `/home/jluitjens/Perforce/jluitjens_dtlogin_p4sw/sw/devrel/DevtechCompute/Internal/Tools/dtlogin/scripts/mpi/openmpi-1.10.1-gcc5.0_2014_11-cuda8.0/opal/tools/wrappers'
>>> CCLD opal_wrapper
>>> ../../../opal/.libs/libopen-pal.so: undefined reference to
>>> `nvmlInit_v2'
>>> ../../../opal/.libs/libopen-pal.so: undefined reference to
>>> `nvmlDeviceGetHandleByIndex_v2'
>>> ../../../opal/.libs/libopen-pal.so: undefined reference to
>>> `nvmlDeviceGetCount_v2'
>>>
>>> Any idea what I might need to change to get around this error?
>>>
>>> Thanks,
>>> Justin
>>>
>>> _______________________________________________
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
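
For anyone who finds this thread later, Gilles' workaround can be wrapped in a small shell helper so that enable_nvml=no applies only to the configure run and is not left exported in the login shell. This is a minimal sketch, not something Open MPI or hwloc ships: the name without_nvml is made up for illustration, and it relies on configure honoring enable_nvml from the environment, as described in the thread.

```shell
#!/bin/sh
# Sketch of the workaround from this thread.
# without_nvml is a hypothetical helper name, not part of Open MPI or hwloc.

# Run the given command with enable_nvml=no in its environment only.
# The assignment prefix does not persist in the calling shell when the
# command is an external program such as ./configure.
without_nvml() {
    enable_nvml=no "$@"
}

# Example use (options are placeholders taken from the original report):
#   without_nvml ./configure --prefix="$PREFIXPATH" --with-cuda="$CUDA_HOME"
#   make && sudo make install
```

The plain "export enable_nvml=no" form quoted above should behave the same for a one-off build; the wrapper only limits the scope of the variable.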