> if I remember right, if you use cgroups, CUDA_VISIBLE_DEVICES always
> starts from zero. So this is NOT the index of the GPU.
Thanks. Just FYI, when I tested the environment variables with Slurm 19.05.2 and a
proctrack/cgroup configuration, it looks like CUDA_VISIBLE_DEVICES matches the
indices on the hos
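
For anyone who wants to repeat this kind of check on their own cluster, here is a minimal sketch. It only prints what the job step sees; the function name report_gpu_env is made up for illustration, and the result will depend on your Slurm version and cgroup settings.

```shell
#!/bin/sh
# Sketch only: print the node name and the CUDA_VISIBLE_DEVICES value
# that the current job step sees. report_gpu_env is a hypothetical helper.
report_gpu_env() {
    # uname -n gives the node name; the ":-unset" default makes a
    # missing variable visible instead of printing an empty string.
    printf 'host=%s CUDA_VISIBLE_DEVICES=%s\n' \
        "$(uname -n)" "${CUDA_VISIBLE_DEVICES:-unset}"
}

report_gpu_env
```

Run it inside an allocation that requests a GPU (for example via srun with --gres=gpu:1) and compare the value against the GPU list that nvidia-smi -L reports on the same host.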
Hi!
I'm trying to install Slurm 20.02 on my cluster with the GPU features. However,
only my compute nodes have GPUs attached and so when I try to install the
slurm-slurmctld RPM on my head node it fails saying it requires the NVIDIA
control software. How do other folks work around this? Do you
Hi Sean,
That sounds like a workable solution, thanks for the suggestion!
I was hoping that there was something else I'd missed in the docs that
lets you do it directly via sacctmgr without having to edit slurm.conf as
well.
Thanks again,
Paddy
On Sat, Jun 20, 2020 at 09:20:02AM +1000, Sean Cr
It needs only the software (for Slurm 19 it was the NVML library), so I copied
that library over to all my non-GPU nodes.
Grigory Shamov
University of Manitoba
From: slurm-users on behalf of
Petrillo, Neale A. (Contractor)
Sent: Monday, June 22, 2020 10
In fairness to our friends at SchedMD, this was filed as an enhancement
request, not a bug.
Since this is an open source project, there are 2 good ways to make it happen:
1. Fund someone, like SchedMD, to make the change.
2. Make the changes yourself, and submit the changes.
Alter
For the record, we filed a bug on this years ago:
https://bugs.schedmd.com/show_bug.cgi?id=3875 It hasn't been fixed yet,
though everyone seems to agree it is a good idea.
Florian's suggestion is probably the best stopgap until this feature is
implemented.
-Paul Edmon-
On 6/22/2020 7:11 AM, Flo
On 16/06/20 16:23, Loris Bennett wrote:
> Thanks for pointing this out - I hadn't been aware of this. Is there
> anywhere in the documentation where this is explicitly stated?
I don't remember. Seems Michael's experience is different. Possibly some
other setting influences that behaviour. Ma
Durai,
To overcome this, we use noXXX features like below. Users can then request
“8268&noGPU&EDR” to select nodes with 8268s on EDR without GPUs for example.
# scontrol show node node5000 |grep AvailableFeatures
AvailableFeatures=192GB,2933MHz,SD530,Platinum,8268,rack25,EDR,sb7890_0416,enc2
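
As a rough sketch of the user side of this, the snippet below joins feature names with "&" (the AND operator of --constraint) and prints the resulting sbatch command instead of submitting it. The helper name build_constraint and the script name job.sh are placeholders, and noGPU only works if the admins have defined it as a feature on the non-GPU nodes, as described above.

```shell
#!/bin/sh
# Sketch only: build an AND constraint string for sbatch/srun.
# build_constraint is a hypothetical helper, not part of Slurm.
build_constraint() {
    out=""
    for feature in "$@"; do
        if [ -z "$out" ]; then
            out="$feature"
        else
            # "&" means AND in a --constraint expression.
            out="$out&$feature"
        fi
    done
    printf '%s\n' "$out"
}

constraint=$(build_constraint 8268 noGPU EDR)

# Dry run: print the command rather than submitting a job.
# job.sh is a placeholder batch script name.
echo "sbatch --constraint=\"$constraint\" job.sh"
```

The quotes around the constraint matter on a real command line, since an unquoted "&" would be interpreted by the shell.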
Hi,
The sbatch/srun commands have the "--constraint" option to select nodes
with certain features. With this you can specify AND, OR, matching OR
operators. But there is no NOT operator. How do you exclude nodes with a
certain feature in the "--constraint" option? Or is there another option
that c