Dear all,
The official Slurm documentation says: "Trigger events are not processed
instantly, but a check is performed for trigger events on a periodic basis
(currently every 15 seconds)."
https://slurm.schedmd.com/strigger.html
Is it possible to reduce this time?
Thank you.
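For reference, a trigger is registered with strigger and then fires on that
polling cycle; a minimal sketch (the handler script path here is illustrative):

    # Ask slurmctld to run a handler when any node goes DOWN
    strigger --set --node --down --program=/usr/sbin/node_down_handler.sh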
The most flexible way to do this is with QoS (PreemptType=preempt/qos). You'll
need to have accounting enabled, and you'll probably want qos listed in
AccountingStorageEnforce. Once you do that, you create a "shared" QoS for the
scavenger jobs and a QoS for each group that buys into resources. Assign the
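A minimal sketch of that setup, assuming accounting is already working (the
QoS names here are illustrative):

    # slurm.conf
    PreemptType=preempt/qos
    PreemptMode=REQUEUE
    AccountingStorageEnforce=associations,limits,qos

    # Create the scavenger QoS, then let a paid group's QoS preempt it:
    sacctmgr add qos scavenger
    sacctmgr modify qos scavenger set priority=0 preemptmode=requeue
    sacctmgr add qos grp_a
    sacctmgr modify qos grp_a set preempt=scavenger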
I've installed two new nodes onto my Slurm cluster. One node works, but
the other one complains about an invalid credential for munge. I've
verified that the munge.key is the same as on all other nodes with:

    sudo cksum /etc/munge/munge.key
I recopied a munge.key from a node that works. I've ver
Two trivial things to check:
1. Permissions on /etc/munge and /etc/munge/munge.key
2. Is munged running on the problem node?
Andy
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of Dean Schulze
Sent: Wednesday, April 15, 2020 1:57 PM
To: Slurm User Community List
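Both of those checks can be exercised quickly; a sketch (the systemd unit name
may differ by distro):

    systemctl status munge        # is munged running?
    munge -n | unmunge            # local encode/decode round trip; expect STATUS: Success (0)
    ls -ld /etc/munge             # permissions on the directory and key
    ls -l /etc/munge/munge.key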
I'd check NTP, as your encoding time seems odd to me.
On Wed, 15 Apr 2020 at 19:59, Dean Schulze wrote:
> I've installed two new nodes onto my slurm cluster. One node works, but
> the other one complains about an invalid credential for munge. I've
> verified that the munge.key is the same as on
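A quick way to verify the clocks, assuming systemd plus chrony or ntp is in use:

    timedatectl status            # look for "System clock synchronized: yes"
    chronyc tracking              # or: ntpq -p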
The default value for TmpDisk is 0, so if you want local scratch available on a
node, the amount of TmpDisk space must be defined in the node configuration in
slurm.conf.
Example (TmpDisk is specified in megabytes):

NodeName=TestNode01 CPUs=8 Boards=1 SocketsPerBoard=2 CoresPerSocket=4
ThreadsPerCore=1 RealMemory=24099 TmpDisk=1
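Jobs can then request that scratch space with --tmp, which, like TmpDisk,
defaults to megabytes (the job script name is illustrative):

    sbatch --tmp=500 job.sh   # only schedules on nodes advertising >= 500 MB of TmpDisk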
/etc/munge is 700
/etc/munge/munge.key is 400
On Wed, Apr 15, 2020 at 12:11 PM Riebs, Andy wrote:
> Two trivial things to check:
>
> 1. Permissions on /etc/munge and /etc/munge/munge.key
>
> 2. Is munged running on the problem node?
>
> Andy
Hi Slurm-Users,
Hope this post finds all of you healthy and safe amidst the ongoing COVID-19
craziness. We've got a strange error state that occurs when we enable
preemption, and we need help diagnosing what is wrong. I'm not sure if we
are missing a default value or other necessary configuration, b
Who owns the munge directory and key? Is it the right uid/gid? Is the munge
daemon running?
--
Sean Crosby | Senior DevOps/HPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia
On Thu, 16 Apr 2020 at 04:57, Dean Schulz
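A sketch of those ownership checks (munge:munge is the expected owner on most
packaged installs):

    stat -c '%U:%G %a %n' /etc/munge /etc/munge/munge.key
    pgrep -au munge munged        # is the daemon running, and as which user?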
Thanks, Erik.
Last night I made the changes. I defined the following in slurm.conf on all
the nodes as well as on the Slurm server:
TmpFS=/lscratch
NodeName=node[01-10] CPUs=44 RealMemory=257380 Sockets=2
CoresPerSocket=22 ThreadsPerCore=1 TmpDisk=160 State=UNKNOWN
Feature=P4000 Gres=gpu:2
These nodes
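One way to confirm the nodes picked up the new values, assuming the node names
from the config above:

    scontrol reconfigure                      # push the slurm.conf change
    scontrol show config | grep -i TmpFS      # global scratch mount point
    scontrol show node node01 | grep TmpDisk  # per-node scratch size (MB)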
You might want to check the Munge section in my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#munge-authentication-service
/Ole
On 15-04-2020 19:57, Dean Schulze wrote:
I've installed two new nodes onto my slurm cluster. One node works, but
the other one complains abou
On 4/15/20 10:57 am, Dean Schulze wrote:
    error: Munge decode failed: Invalid credential
    ENCODED: Wed Dec 31 17:00:00 1969
    DECODED: Wed Dec 31 17:00:00 1969
    error: authentication: Invalid authentication credential

That's really interesting; I had one of these last week when on call,
fo
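For what it's worth, that ENCODED/DECODED value is Unix epoch 0 rendered in a
UTC-7 zone, which suggests the credential's timestamp field decoded as zero
rather than a merely skewed clock; it can be reproduced directly:

    TZ=America/Denver date -d @0    # Wed Dec 31 17:00:00 MST 1969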