I'm pretty sure that a single, central slurmdbd service is required for
multiple, federated clusters. I think that's what ties multiple
clusters together into a single "federation".
You mention a problem with squeue, but you don't list the error
messages. Are you sure that all nodes have i
"Hwa, George" writes:
> In example gres.conf,
>
>Name=gpu File=/dev/nvidia0
>
> Does slurm actually read the device file and get information from it
> for configuration/control?
It seems to me that it at least will do an existence check. If the GPU
device files are not there (e.g. drivers n
Allan, thanks for the explanation.
So the name "gpu" is not that generic. It actually means something to SLURM, or
its plugin.
This leads me to another related question: how do I define a generic resource that
isn't associated with any real device?
Regards,
George
> -Original Message
Slurm versions 16.05.11, 17.02.9 and 17.11.0rc2 are now available, and
include a series of recent bug fixes as well as a fix for a recently
discovered security vulnerability (CVE-2017-15566).
Downloads are available at https://www.schedmd.com/downloads.php .
Ryan Day (LLNL) reported an issue
hello,
I want to limit the memory per cpu in my cluster, some settings in
slurm.conf are like this:
NodeName=c[01-10] CPUs=32 RealMemory=127360
PartitionName=C032 Nodes=c[01-10] MaxMemPerCPU=3980 DefMemPerCPU=3980
MaxCPUsPerNode=32
I have 10 nodes, each node has 127360M memory, and 32 CP
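
For reference, the numbers in the configuration above fit together as plain arithmetic, sketched below in Python. The 20000 MB request and the helper names are hypothetical, chosen only for illustration; this is not a statement of how Slurm itself enforces the limit.

```python
import math

# Values copied from the slurm.conf lines quoted above.
REAL_MEMORY_MB = 127360
CPUS_PER_NODE = 32
MAX_MEM_PER_CPU_MB = 3980


def even_share_mb(real_memory_mb: int, cpus: int) -> float:
    """Memory per CPU if a node's memory is divided evenly among its CPUs."""
    return real_memory_mb / cpus


def cpus_needed(requested_mb: int, max_mem_per_cpu_mb: int) -> int:
    """Smallest CPU count whose combined per-CPU allowance covers the request."""
    return math.ceil(requested_mb / max_mem_per_cpu_mb)


# 127360 MB spread over 32 CPUs matches the MaxMemPerCPU value exactly.
print(even_share_mb(REAL_MEMORY_MB, CPUS_PER_NODE))  # 3980.0

# Hypothetical request of 20000 MB: 5 CPUs allow 19900 MB, so 6 are needed.
print(cpus_needed(20000, MAX_MEM_PER_CPU_MB))  # 6
```

So MaxMemPerCPU=3980 is simply each node's RealMemory divided evenly over its 32 CPUs.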
On 02/11/17 14:34, 马银萍 wrote:
> It means that he used only one CPU and asked for 125G of memory, so he used
> most of the memory on that node, which then affects other users' jobs;
> this should not be allowed.
> So is there any way to strictly limit the average memory per CPU so that
> users can't override it? o
hi,
I'll try to test it again.
Thank you for your help, Ole
Best regards.
zhangtao102...@126.com
From: Ole Holm Nielsen
Date: 2017-11-01 15:36
To: slurm-dev
Subject: [slurm-dev] Re: question about federation
I'm pretty sure that a single, central slurmdbd service is required for
multiple,