Typo, it should be --> **located at /usr/include/nvml.h**
On Wed, Apr 14, 2021 at 5:47 PM Cristóbal Navarro <
cristobal.navarr...@gmail.com> wrote:
> Hi community,
> I have set up the configuration files as mentioned in the documentation,
> but the slurmd of the GPU-compute node fails with t
Hi community,
I have set up the configuration files as mentioned in the documentation,
but the slurmd of the GPU-compute node fails with the following error shown
in the log.
After reading the Slurm documentation, it is not entirely clear to me how
to properly set up GPU autodetection for GRES.
Dear distinguished list,
I am new to SLURM. I have recently installed SLURM 20.11.3 on two separate
three-node clusters. The first cluster was for testing purposes, using
three small RHEL 7.7 VMs (8 cores, 8 GB RAM). After a successful installation
and some sbatch testing, I proceeded to the second
Before you get all excited about it, we have had a terrible time trying to get
GPU metrics. We finally abandoned it and switched to Grafana, Prometheus, and Influx.
Good luck to you though.
From: slurm-users on behalf of "Heckes, Frank"
Reply-To: Slurm User Community List
Date: Wednesday, April 14,
Oh, and I forgot to mention that we are using Slurm version 20.11.3.
Best,
Thomas
On Wed, 14 Apr 2021 at 09:23 +0200, Thomas Arildsen wrote:
I administer a Slurm cluster with many users and the operation of the
cluster currently appears "totally normal" for all users except
one. This one user
I administer a Slurm cluster with many users and the operation of the
cluster currently appears "totally normal" for all users except
one. This one user gets all attempts to run commands through Slurm
killed after 20-25 seconds (I think the cause is another job - not so
much the time, see furt