I just ran into this issue. Specifically, Slurm's configure script looks for 
the NVML header file (nvml.h), which comes with CUDA or DCGM, in addition to 
the NVML library that comes with the drivers. The check is at 
https://github.com/SchedMD/slurm/blob/a763a008b7700321b51aad2e619deab00638a379/auxdir/x_ac_nvml.m4#L32.
Once you’ve built Slurm, it’s enough to just have the GPU drivers on the nodes 
where Slurm will be installed.
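
In case it helps, here is a rough Python sketch of the same idea: before 
running configure on the build host, look for both the header and the library. 
The prefixes and subdirectories below are my assumptions about common install 
locations, not the list the m4 macro actually searches, and find_nvml is just a 
name I made up for illustration.

```python
from pathlib import Path

# Assumed locations; adjust for your CUDA/DCGM and driver install.
HEADER_SUBDIRS = ("include",)
LIBRARY_SUBDIRS = ("lib64", "lib", "lib64/stubs", "lib/x86_64-linux-gnu")


def find_nvml(prefixes):
    """Return (header_path, library_path); each is None when not found."""
    header = library = None
    for prefix in map(Path, prefixes):
        if header is None:
            for sub in HEADER_SUBDIRS:
                candidate = prefix / sub / "nvml.h"
                if candidate.is_file():
                    header = candidate
                    break
        if library is None:
            for sub in LIBRARY_SUBDIRS:
                directory = prefix / sub
                if not directory.is_dir():
                    continue
                # Match libnvidia-ml.so as well as versioned names.
                matches = sorted(directory.glob("libnvidia-ml.so*"))
                if matches:
                    library = matches[0]
                    break
    return header, library


if __name__ == "__main__":
    # Hypothetical prefixes for a typical build host.
    header, library = find_nvml(["/usr/local/cuda", "/usr"])
    print("nvml.h:       ", header or "not found")
    print("libnvidia-ml: ", library or "not found")
```

If either line prints "not found", I would expect configure to disable NVML 
support in the same way, though the directories it searches may differ from 
the ones guessed here.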

On Apr 8, 2020, at 9:32 AM, dean.w.schu...@gmail.com wrote:

I believe that in order to compile with NVML support you'll have to compile on 
a system with an Nvidia GPU installed; otherwise the Nvidia driver and 
libraries won't install on that system.

-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of 
Christopher Samuel
Sent: Tuesday, April 7, 2020 10:08 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Header lengths are longer than data received after 
changing SelectType & GresTypes to use MPS

On 4/7/20 2:48 PM, Robert Kudyba wrote:

How can I get this to work by loading the correct Bright module?

You can't - you will need to recompile Slurm.

The error says:

Apr 07 16:52:33 node001 slurmd[299181]: fatal: We were configured to autodetect 
nvml functionality, but we weren't able to find that lib when Slurm was 
configured.

So when Slurm was built, the libraries you are now telling it to use were not 
detected, and the configure script disabled that functionality because it 
would not otherwise have been able to compile.

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



