Re: [slurm-users] How to limit # of execution slots for a given node

2022-01-06 Thread Rémi Palancher
On Thursday, 6 January 2022 at 22:39, David Henkemeyer wrote: All, when my team used PBS, we had several nodes that had a TON of CPUs, so many, in fact, that we ended up setting np to a smaller value in order to not starve the system of memory. What is the best way to do this with…

Re: [slurm-users] How to limit # of execution slots for a given node

2022-01-06 Thread Ole Holm Nielsen
Hi David, On 1/6/22 22:39, David Henkemeyer wrote: When my team used PBS, we had several nodes that had a TON of CPUs, so many, in fact, that we ended up setting np to a smaller value, in order to not starve the system of memory. What is the best way to do this with Slurm? I tried modifying…
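
A minimal slurm.conf sketch of the approach discussed here; the node name, CPU count and memory below are made-up examples, and whether SlurmdParameters=config_overrides is appropriate depends on your site:

    # slurm.conf: advertise fewer CPUs than the hardware actually has
    NodeName=bignode01 CPUs=64 RealMemory=515000 State=UNKNOWN

    # Optional: have slurmd trust the configured values rather than the
    # detected hardware, so the node is not flagged as misconfigured
    SlurmdParameters=config_overrides

An alternative sometimes used for the memory-starvation concern is MaxMemPerCPU, which caps memory per allocated CPU instead of hiding CPUs.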

Re: [slurm-users] SlurmDBD 20.02.7

2022-01-06 Thread Danny Marc Rotscher
Hi Bas, thank you very much for linking the bug report; the solution mentioned there helped me solve the problem. 1. Stop slurmdbd: systemctl stop slurmdbd. We run SlurmDBD in Docker with Supervisord, so we have to stop it as follows: supervisorctl stop slurmdbd. 2. Modify the…
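
For reference, the stop step from the message above under the two process managers mentioned (the program name slurmdbd is assumed to match your supervisord.conf):

    # plain systemd installation
    systemctl stop slurmdbd

    # the Docker/Supervisord setup described above
    supervisorctl stop slurmdbd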

[slurm-users] How to limit # of execution slots for a given node

2022-01-06 Thread David Henkemeyer
All, when my team used PBS, we had several nodes that had a TON of CPUs, so many, in fact, that we ended up setting np to a smaller value, in order to not starve the system of memory. What is the best way to do this with Slurm? I tried modifying the # of CPUs in the slurm.conf file, but I noticed th…
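
A quick way to see the mismatch being described, i.e. what the hardware reports versus what the controller has configured (the node name is a hypothetical example):

    # print the CPU/socket/core/memory layout slurmd detects on the node
    slurmd -C

    # show what the controller currently believes about the node
    scontrol show node bignode01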

Re: [slurm-users] Use gres to handle permissions of /dev/dri/card* and /dev/dri/renderD*?

2022-01-06 Thread Stephan Roth
Hi Martin, my (quick and unrefined) thoughts about this: this could only work if you don't have ConstrainDevices=yes in your cgroup.conf, which I don't think is a good idea, as jobs could then use GPUs allocated to other jobs. Let's assume you don't use ConstrainDevices=yes: the GPUs allocated to…
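
For context, a sketch of the cgroup.conf setting referred to here, which makes Slurm restrict each job to the device files of the GRES it was allocated (node name and device paths are illustrative):

    # cgroup.conf
    ConstrainDevices=yes

    # gres.conf: the device files listed here are what the device cgroup
    # allows or denies per job
    NodeName=gpunode01 Name=gpu File=/dev/nvidia[0-3]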

Re: [slurm-users] Use gres to handle permissions of /dev/dri/card* and /dev/dri/renderD*?

2022-01-06 Thread Martin Pecka
Hello, I'm reviving a bit of an old thread, but I just noticed that my January 2021 message doesn't appear in the archives, so I'm sending it again now that the issue has come up again on our side. To quickly recap, we want to add permissions not only to the /dev/nvidia* devices based on the requested gres, but…
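
One possible direction (not necessarily what this thread settled on) is to declare the DRI device files as their own GRES so the device cgroup grants them alongside the GPU; a rough sketch, with all names, paths and the GPU-to-DRI mapping being illustrative assumptions:

    # slurm.conf
    GresTypes=gpu,dri

    # gres.conf (per node; the dri indices must be kept in sync with the GPUs)
    NodeName=gpunode01 Name=gpu File=/dev/nvidia0
    NodeName=gpunode01 Name=dri File=/dev/dri/card0
    NodeName=gpunode01 Name=dri File=/dev/dri/renderD128

    # a job would then request both, e.g.
    sbatch --gres=gpu:1,dri:2 job.sh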

Re: [slurm-users] SlurmDBD 20.02.7

2022-01-06 Thread Bas van der Vlies
Hi Danny, we had the same issue when we upgraded Slurm to 20.11, but maybe the solution also works for you: * https://bugs.schedmd.com/show_bug.cgi?id=12947 On 06/01/2022 15:36, Danny Marc Rotscher wrote: Hello everyone, today we updated our Slurm database daemon from 20.02.2 to 20.02.7 and…

[slurm-users] SlurmDBD 20.02.7

2022-01-06 Thread Danny Marc Rotscher
Hello everyone, today we updated our Slurm database daemon from 20.02.2 to 20.02.7, and everything works except that deleting a user does not: sacctmgr -i delete user name=xyz account=xyz sacctmgr: slurmdbd: No error Nothing deleted slurmdbd.log: slurmdbd_1 | slurmdbd: error: mysql…
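
For anyone hitting the same thing, the state can be inspected from the sacctmgr side before retrying the delete (user and account names are placeholders, as in the message):

    # list the associations the delete is supposed to remove
    sacctmgr show assoc where user=xyz account=xyz

    # the failing command from above, without -i to get a confirmation prompt
    sacctmgr delete user name=xyz account=xyz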
