We bumped ours up for a while 20+ years ago when we had a flaky
network connection between the two buildings holding our compute nodes. If you
need more than 600s, you have networking problems.
On Mon, Feb 12, 2024 at 5:41 PM Timony, Mick via slurm-users <
slurm-users@lists.schedmd.com> wrote:
We set SlurmdTimeout=600. The docs say not to go any higher than 65533 seconds:
https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmdTimeout
The FAQ has info about SlurmdTimeout also. The worst thing that could happen is
that it will take longer to set nodes as being down:
>A node is set DOWN when the s
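For anyone skimming the thread, a minimal illustration of where this lives, in
slurm.conf on the controller and the compute nodes (600 is simply the value we
use, as mentioned above, not a recommendation):

    # slurm.conf -- illustrative fragment only
    # The controller marks a node DOWN once its slurmd has been
    # unresponsive for this many seconds.
    SlurmdTimeout=600

After editing, "scontrol reconfigure" is normally enough to pick up the change
without restarting the daemons.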
Hi Richard,
I hope your day is treating you well.
Thank you for your posts on the Slurm user list.
Would there be interest on your side in a Slurm support contract for
your systems at the University of Nantes?
Sites running Slurm with support give us feedback that support is
invaluable and a
We've been running one cluster with SlurmdTimeout = 1200 sec for a
couple of years now, and I haven't seen any problems due to that.
--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
Hi,
I am trying to help a sysadmin colleague (and to understand this myself) who is
configuring a new Slurm server. He is struggling to work out whether there is an
alternative way to configure per-user job submission policies in Slurm without
necessarily installing an accounting MariaDB service.
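To make the question concrete, this is the sort of thing we are wondering
whether it would be enough (a hypothetical slurm.conf sketch, with made-up
partition and node names; my understanding is that true per-user limits via
QOS/associations would still need slurmdbd):

    # slurm.conf -- hypothetical fragment, no accounting database configured
    AccountingStorageType=accounting_storage/none
    # Partition-level caps apply to every user submitting to the partition:
    PartitionName=batch Nodes=node[01-10] Default=YES MaxTime=2-00:00:00 MaxNodes=4
    # A site-written job_submit/lua script can reject or rewrite jobs per user
    # at submission time, without any database:
    JobSubmitPlugins=lua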
Hi,
We've been experiencing issues with network saturation on our older nodes
caused by storage (GPFS) backups. This causes slurmctld to lose contact with
slurmd on some compute nodes, and user jobs to be killed. While the longer-term
solution is to replace these and upgrade the network, I'