On 1/19/23 5:01 am, Stefan Staeglich wrote:
Hi,
Hiya,
I'm wondering where the UnkillableStepProgram is actually executed. According
to Mike it has to be available on every one of the compute nodes. This makes sense
only if it is executed there.
That's right, it's only executed on compute nodes.
But the slurm.conf man page for 21.08.x states:
UnkillableStepProgram
Must be executable by user SlurmUser. The file must be
accessible by the primary and backup control machines.
So I would expect it's executed on the controller node.
That's strange, my slurm.conf man page from a system still running 21.08
says:
UNKILLABLE STEP PROGRAM SCRIPT
This program can be used to take special actions to clean up
the unkillable processes and/or notify system administrators.
The program will be run as SlurmdUser (usually "root") on
the compute node where UnkillableStepTimeout was triggered.
Ah, I see, there's a later "FILE AND DIRECTORY PERMISSIONS" part which
has the text that you've found - that part's wrong! :-)
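
For what it's worth, here is a minimal sketch of what such a program could look like, given that it runs as SlurmdUser on the compute node. It is an illustration only, not something taken from the Slurm sources. It assumes slurmd passes the job and step IDs in the SLURM_JOB_ID and SLURM_STEP_ID environment variables (please check the man page for your version), and the build_message helper and the log wording are made up for this example.

```python
#!/usr/bin/env python3
"""Hypothetical UnkillableStepProgram: log which job step could not be killed."""
import os
import socket
import syslog


def build_message(env, hostname):
    # Assumption: slurmd exports SLURM_JOB_ID and SLURM_STEP_ID to this
    # program. Fall back to "unknown" so something is still logged if not.
    job_id = env.get("SLURM_JOB_ID", "unknown")
    step_id = env.get("SLURM_STEP_ID", "unknown")
    return f"unkillable step {job_id}.{step_id} on {hostname}"


def main():
    # Runs on the compute node itself, so the local hostname identifies
    # the node where UnkillableStepTimeout was triggered.
    message = build_message(os.environ, socket.gethostname())
    syslog.syslog(syslog.LOG_ERR, message)


if __name__ == "__main__":
    main()
```

Since it is executed locally by slurmd, a script like this would need to be installed at the same path on every compute node, with UnkillableStepProgram in slurm.conf pointing at that path.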
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA