FWIW, I am still open to implementing something to work around this in hwloc.
It could be a shell variable such as HWLOC_DISABLE_NVML=yes, available for
all our major configured dependencies.

Brice



On 24/10/2016 02:12, Gilles Gouaillardet wrote:
> Justin,
>
>
> iirc, NVML is only used by hwloc (e.g. not by CUDA) and there is no
> real benefit for having that.
>
> as a workaround, you can
>
> export enable_nvml=no
>
> and then configure && make install
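
A minimal sketch of the workaround quoted above, assuming it is run from the top of the Open MPI source tree. PREFIXPATH and CUDA_HOME come from Justin's original configure line; the fallback values and the ./configure guard are assumptions added for illustration only.

```shell
# Pre-set the variable that configure would otherwise derive from
# --enable-nvml/--disable-nvml, as suggested in the workaround above.
export enable_nvml=no

# Guard (an assumption for this sketch): only build when a configure
# script is present in the current directory.
if [ -x ./configure ]; then
    # The fallback paths below are hypothetical defaults, not part of the thread.
    ./configure --prefix="${PREFIXPATH:-$HOME/openmpi}" \
                --with-cuda="${CUDA_HOME:-/usr/local/cuda}" \
        && make && make install
else
    echo "no ./configure here; enable_nvml=${enable_nvml} is set for a later run"
fi
```

The variable has to be exported before configure starts, since it is read at configure time and has no effect on an already configured tree.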
>
> Cheers,
>
> Gilles
>
> On 10/20/2016 12:49 AM, Jeff Squyres (jsquyres) wrote:
>> Justin --
>>
>> Fair point.  Can you work with Sylvain Jeaugey (at Nvidia) to submit
>> a pull request for this functionality?
>>
>> Thanks.
>>
>>
>>> On Oct 18, 2016, at 2:26 PM, Justin Luitjens <jluitj...@nvidia.com>
>>> wrote:
>>>
>>> After looking into this a bit more, it appears that the issue is that I
>>> am building on a head node which does not have the driver installed.
>>> Building on a back-end node resolves this issue.  In CUDA 8.0 the NVML
>>> stubs can be found in the toolkit at the following path:
>>> ${CUDA_HOME}/lib64/stubs
>>>
>>> For 8.0 I’d suggest updating the configure/make scripts to look
>>> for nvml there and link in the stubs.  This way the build depends
>>> only on the toolkit and not on the driver being installed.
>>>
>>> Thanks,
>>> Justin
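
Until configure looks in the stubs directory by itself, the suggestion above might be approximated by hand. This is only a sketch: it assumes the stub is named libnvidia-ml.so, that passing the directory through LDFLAGS is enough for the link step, and that /usr/local/cuda is a reasonable fallback when CUDA_HOME is unset. None of this has been confirmed in the thread.

```shell
# Assumption: CUDA_HOME points at the CUDA 8.0 toolkit; the fallback is hypothetical.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
STUBS_DIR="${CUDA_HOME}/lib64/stubs"

# Add the stubs directory to the link path only if the NVML stub is present.
if [ -e "${STUBS_DIR}/libnvidia-ml.so" ]; then
    LDFLAGS="-L${STUBS_DIR} ${LDFLAGS:-}"
    export LDFLAGS
    echo "linking against NVML stub in ${STUBS_DIR}"
else
    echo "no NVML stub found in ${STUBS_DIR}; LDFLAGS left unchanged"
fi
```

After that, the configure and make line from the original report would be run in the same shell. The stub only satisfies the linker; at run time the real NVML library installed with the driver still has to be found on the compute nodes.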
>>>
>>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
>>> Justin Luitjens
>>> Sent: Tuesday, October 18, 2016 9:53 AM
>>> To: users@lists.open-mpi.org
>>> Subject: [OMPI users] Problem building OpenMPI with CUDA 8.0
>>>
>>> I have the release version of CUDA 8.0 installed and am trying to
>>> build OpenMPI.
>>>
>>> Here is my configure and build line:
>>>
>>> ./configure --prefix=$PREFIXPATH --with-cuda=$CUDA_HOME --with-tm=
>>> --with-openib= && make && sudo make install
>>>
>>> Where CUDA_HOME points to the cuda install path.
>>>
>>> When I run the above command it builds for quite a while but
>>> eventually errors out with this:
>>>
>>> make[2]: Entering directory
>>> `/home/jluitjens/Perforce/jluitjens_dtlogin_p4sw/sw/devrel/DevtechCompute/Internal/Tools/dtlogin/scripts/mpi/openmpi-1.10.1-gcc5.0_2014_11-cuda8.0/opal/tools/wrappers'
>>>    CCLD     opal_wrapper
>>> ../../../opal/.libs/libopen-pal.so: undefined reference to
>>> `nvmlInit_v2'
>>> ../../../opal/.libs/libopen-pal.so: undefined reference to
>>> `nvmlDeviceGetHandleByIndex_v2'
>>> ../../../opal/.libs/libopen-pal.so: undefined reference to
>>> `nvmlDeviceGetCount_v2'
>>>
>>> Any idea what I might need to change to get around this error?
>>>
>>> Thanks,
>>> Justin
>>>
>>> This email message is for the sole use of the intended recipient(s)
>>> and may contain confidential information.  Any unauthorized review,
>>> use, disclosure or distribution is prohibited.  If you are not the
>>> intended recipient, please contact the sender by reply email and
>>> destroy all copies of the original message.
>>> _______________________________________________
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>

