Re: [slurm-users] disable-bindings disables counting of gres resources

2019-04-15 Thread Peter Steinbach
Hi Chris, thanks for the detailed feedback. This is slurm 18.08.5, see also https://github.com/psteinb/docker-centos7-slurm/blob/7bdb89161febacfd2dbbcb3c5684336fb73d7608/Dockerfile#L9 Best, Peter

Re: [slurm-users] disable-bindings disables counting of gres resources

2019-04-15 Thread Christopher Samuel
On 4/15/19 8:15 AM, Peter Steinbach wrote:
> We had a feeling that cgroups might be more optimal. Could you point us to documentation that suggests cgroups to be a requirement?
Oh it's not a requirement, just that without it there's nothing to stop a process using GPUs outside of its allocation

Re: [slurm-users] disable-bindings disables counting of gres resources

2019-04-15 Thread Peter Steinbach
Hi Chris, thanks for following up on this thread.
> First of all, you will want to use cgroups to ensure that processes that do not request GPUs cannot access them.
We had a feeling that cgroups might be more optimal. Could you point us to documentation that suggests cgroups to be a requireme

Re: [slurm-users] disable-bindings disables counting of gres resources

2019-04-13 Thread Chris Samuel
On Monday, 25 March 2019 2:30:34 AM PDT Peter Steinbach wrote:
> I observed a weird behavior of the '--gres-flags=disable-binding'
> option. With the above .conf files, I created a local slurm cluster with
> 3 computes (2 GPUs and 4 cores each).
First of all, you will want to use cgroups to ensur

Re: [slurm-users] disable-bindings disables counting of gres resources

2019-04-05 Thread Quirin Lohr
Same problem here: a job submitted with --gres-flags=disable-binding is assigned a node, but then the job step fails because all GPUs on that node are already in use. Log messages:
[2019-04-05T15:29:05.216] error: gres/gpu: job 92453 node node5 overallocated resources by 1, (9 > 8)
[2019-04-05
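For anyone who wants to check their own controller logs for the same symptom, here is a minimal sketch that picks the overallocation message quoted above apart. The regular expression is written only against that one quoted line, and the field names (`job_id`, `node`, `excess`, `count`, `limit`) are my own labels, not Slurm terminology; reading the two numbers in parentheses as "GPUs counted vs. GPUs configured" is an assumption.

```python
import re

# Matches the gres/gpu overallocation message as quoted in this thread.
# Field names are illustrative labels, not official Slurm names.
OVERALLOC_RE = re.compile(
    r"error: gres/gpu: job (?P<job_id>\d+) node (?P<node>\S+) "
    r"overallocated resources by (?P<excess>\d+), "
    r"\((?P<count>\d+) > (?P<limit>\d+)\)"
)


def parse_overallocation(line):
    """Return the fields of an overallocation message, or None if no match."""
    match = OVERALLOC_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Keep the node name as text; convert every other field to an integer.
    return {key: (val if key == "node" else int(val)) for key, val in fields.items()}


if __name__ == "__main__":
    sample = (
        "[2019-04-05T15:29:05.216] error: gres/gpu: job 92453 node node5 "
        "overallocated resources by 1, (9 > 8)"
    )
    print(parse_overallocation(sample))
```

Running it over a whole log file and collecting the non-None results would give a quick list of affected jobs and nodes.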

Re: [slurm-users] disable-bindings disables counting of gres resources

2019-03-29 Thread Peter Steinbach
Just to follow up, I filed a medium bug report with schedmd on this: https://bugs.schedmd.com/show_bug.cgi?id=6763 Best, Peter On 3/25/19 10:30 AM, Peter Steinbach wrote: Dear all, Using these config files, https://github.com/psteinb/docker-centos7-slurm/blob/7bdb89161febacfd2dbbcb3c5684336fb

[slurm-users] disable-bindings disables counting of gres resources

2019-03-25 Thread Peter Steinbach
Dear all, Using these config files, https://github.com/psteinb/docker-centos7-slurm/blob/7bdb89161febacfd2dbbcb3c5684336fb73d7608/gres.conf https://github.com/psteinb/docker-centos7-slurm/blob/7bdb89161febacfd2dbbcb3c5684336fb73d7608/slurm.conf I observed a weird behavior of the '--gres-flags=