date:20240212

[slurm-users] Re: Increasing SlurmdTimeout beyond 300 Seconds

2024-02-12 Thread Fulcomer, Samuel via slurm-users

We'd bumped ours up for a while 20+ years ago when we had a flaky network connection between two buildings holding our compute nodes. If you need more than 600s you have networking problems. On Mon, Feb 12, 2024 at 5:41 PM Timony, Mick via slurm-users <slurm-users@lists.schedmd.com> wrote: > We

[slurm-users] Re: Increasing SlurmdTimeout beyond 300 Seconds

2024-02-12 Thread Timony, Mick via slurm-users

We set SlurmdTimeout=600. The docs say not to go any higher than 65533 seconds: https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmdTimeout The FAQ also has info about SlurmdTimeout. The worst thing that could happen is that it will take longer to set nodes as being down: >A node is set DOWN when the s

[slurm-users] Re: simple question, I guess… from a newbie sysadmin

2024-02-12 Thread Jess Arrington via slurm-users

Hi Richard, I hope your day is treating you well. Thank you for your posts on the Slurm user list. Would there be interest on your side to see a Slurm support contract for your systems at University of Nantes? Sites running Slurm with support give us feedback that support is invaluable and a

[slurm-users] Re: Increasing SlurmdTimeout beyond 300 Seconds

2024-02-12 Thread Bjørn-Helge Mevik via slurm-users

We've been running one cluster with SlurmdTimeout = 1200 sec for a couple of years now, and I haven't seen any problems due to that. -- Regards, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of Oslo

[slurm-users] simple question, I guess… from a newbie sysadmin

2024-02-12 Thread Richard Randriatoamanana via slurm-users

Hi, I am trying to help a sysadmin colleague (and to understand this myself) who is configuring a new Slurm server. He is struggling to work out whether there is an alternative way to configure Slurm to manage per-user job submission policies without necessarily installing an accounting MariaDB service.

[slurm-users] Increasing SlurmdTimeout beyond 300 Seconds

2024-02-12 Thread Andrew Baughan (ITCS - Staff) via slurm-users

Hi, We've been experiencing issues with network saturation on our older nodes caused by storage (GPFS) backups. This causes slurmctld to lose contact with slurmd on some compute nodes and user jobs to be killed. While the longer term solution is to replace these and upgrade the network, I'

6 matches
