Brice,
unless you want to enable/disable NVML at runtime, and assuming we do
not need NVML in Open MPI, IMHO the easiest workaround is to update
https://github.com/open-mpi/ompi/blob/master/opal/mca/hwloc/hwloc1113/configure.m4
and add the one-liner
enable_nvml=no
A better option could be to update
https://github.com/open-mpi/ompi/blob/master/opal/mca/hwloc/configure.m4
and pass the --enable-nvml option from Open MPI down to hwloc.
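For illustration, the pass-through idea could come down to a defaulting rule like the one below. This is only a sketch: resolve_enable_nvml is a hypothetical helper name, not something present in either configure.m4, and it relies only on the standard Autoconf behavior that --enable-nvml / --disable-nvml leave the shell variable enable_nvml set to yes / no.

```shell
# Hypothetical helper (not part of Open MPI or hwloc): decide which value of
# enable_nvml should be handed to the embedded hwloc configure logic.
# $1 is enable_nvml as left by configure's option parsing; it is empty when
# the user passed neither --enable-nvml nor --disable-nvml.
resolve_enable_nvml() {
    # Keep an explicit user choice, otherwise default to "no".
    printf '%s\n' "${1:-no}"
}

# Apply the rule to whatever value is currently set in this shell.
enable_nvml=$(resolve_enable_nvml "${enable_nvml:-}")
```

With such a rule, an explicit --enable-nvml would still reach hwloc, while a plain configure run would skip NVML.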
Cheers,
Gilles
On 10/24/2016 4:45 PM, Brice Goglin wrote:
FWIW, I am still open to implementing something to work around this in hwloc.
It could be a shell variable such as HWLOC_DISABLE_NVML=yes, with one such
variable for each of our major configure-time dependencies.
Brice
On 24/10/2016 02:12, Gilles Gouaillardet wrote:
Justin,
IIRC, NVML is only used by hwloc (i.e. not by CUDA), and there is no
real benefit to having it.
As a workaround, you can
export enable_nvml=no
and then run configure && make install
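Spelled out as a script, the workaround might look like the sketch below. PREFIXPATH and CUDA_HOME are the variables from the original report; the guard around configure is an addition so the snippet does nothing when run outside an Open MPI source tree.

```shell
# Autoconf stores --enable-nvml / --disable-nvml in the shell variable
# enable_nvml; exporting it beforehand is meant to pre-seed that choice for
# the hwloc configure logic embedded in Open MPI.
export enable_nvml=no

# Run the build only when a configure script is present in the current
# directory (assumed to be the top of the Open MPI source tree).
if [ -x ./configure ]; then
    ./configure --prefix="$PREFIXPATH" --with-cuda="$CUDA_HOME" \
        && make && make install
fi
```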
Cheers,
Gilles
On 10/20/2016 12:49 AM, Jeff Squyres (jsquyres) wrote:
Justin --
Fair point. Can you work with Sylvain Jeaugey (at Nvidia) to submit
a pull request for this functionality?
Thanks.
On Oct 18, 2016, at 2:26 PM, Justin Luitjens <jluitj...@nvidia.com>
wrote:
After looking into this a bit more, it appears that the issue is that I am
building on a head node which does not have the driver installed.
Building on a back-end node resolves this issue. In CUDA 8.0 the NVML
stubs can be found in the toolkit at the following path:
${CUDA_HOME}/lib64/stubs
For 8.0 I’d suggest updating the configure/make scripts to look
for NVML there and link in the stubs. This way the build depends
only on the toolkit, not on the driver being installed.
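A minimal sketch of that lookup, under stated assumptions: nvml_stub_ldflags is a hypothetical helper name, and the stub library is assumed to be named libnvidia-ml.so inside the stubs directory. It is not what the Open MPI or hwloc configure scripts currently do.

```shell
# Hypothetical helper: print a -L flag for the NVML stub directory if the
# CUDA toolkit at $1 provides one, and print nothing otherwise.
# Assumes the stub library is named libnvidia-ml.so.
nvml_stub_ldflags() {
    stub_dir="$1/lib64/stubs"
    if [ -e "$stub_dir/libnvidia-ml.so" ]; then
        printf '%s\n' "-L$stub_dir"
    fi
}

# Possible use at configure time:
#   LDFLAGS="$(nvml_stub_ldflags "$CUDA_HOME") $LDFLAGS" \
#       ./configure --with-cuda="$CUDA_HOME"
```

The stubs only satisfy the linker at build time; at run time the real library from the driver would still be needed on the compute nodes.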
Thanks,
Justin
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
Justin Luitjens
Sent: Tuesday, October 18, 2016 9:53 AM
To: users@lists.open-mpi.org
Subject: [OMPI users] Problem building OpenMPI with CUDA 8.0
I have the release version of CUDA 8.0 installed and am trying to
build OpenMPI.
Here is my configure and build line:
./configure --prefix=$PREFIXPATH --with-cuda=$CUDA_HOME --with-tm= --with-openib= && make && sudo make install
Where CUDA_HOME points to the cuda install path.
When I run the above command it builds for quite a while but
eventually errors out with this:
make[2]: Entering directory `/home/jluitjens/Perforce/jluitjens_dtlogin_p4sw/sw/devrel/DevtechCompute/Internal/Tools/dtlogin/scripts/mpi/openmpi-1.10.1-gcc5.0_2014_11-cuda8.0/opal/tools/wrappers'
CCLD opal_wrapper
../../../opal/.libs/libopen-pal.so: undefined reference to `nvmlInit_v2'
../../../opal/.libs/libopen-pal.so: undefined reference to `nvmlDeviceGetHandleByIndex_v2'
../../../opal/.libs/libopen-pal.so: undefined reference to `nvmlDeviceGetCount_v2'
Any idea what I might need to change to get around this error?
Thanks,
Justin
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users