Jim, I’m glad you got your problem solved. Here is an additional tip that will make it easier to fix in the future. You don’t need to put scontrol into a loop; the NodeName parameter will take a node range expression. So, you can use NodeName=sjc01enadsapp[01-08]. A SysAdmin in training saw me do
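
To make the tip about node range expressions concrete, here is a minimal sketch that contrasts the loop form with the single-call range form. It only prints the scontrol commands instead of running them, so it can be tried on a machine without Slurm. The helper names `range_cmd` and `loop_cmds`, and the `State=DRAIN Reason=maintenance` arguments, are assumptions chosen for illustration; the original message does not say which update was being applied.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the scontrol command for a node list or
# node range expression. It does NOT execute scontrol.
# State=DRAIN and Reason=maintenance are illustrative assumptions.
range_cmd() {
  local nodes="$1"
  printf 'scontrol update NodeName=%s State=DRAIN Reason=maintenance\n' "$nodes"
}

# Loop form: eight separate invocations, one per node.
loop_cmds() {
  local i
  for i in 1 2 3 4 5 6 7 8; do
    range_cmd "$(printf 'sjc01enadsapp%02d' "$i")"
  done
}

# Range form: one invocation covering the same eight nodes.
# The quotes keep the shell from treating [01-08] as a glob pattern.
range_cmd 'sjc01enadsapp[01-08]'
```

On a system with Slurm installed, `scontrol show hostnames 'sjc01enadsapp[01-08]'` lists the individual node names that such an expression covers, which is a quick way to check a range before using it in an update.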
I am seeing what I think might be a bug with sacct. When I do the
*> sbatch --export=NONE --wrap='uname -a' --exclusive*
*Submitted batch job 2869585*
Then, I ask sacct for the SubmitLine, as such:
*> sacct -j 2869586 -o
On Wednesday, 04 May 2022, at 10:00:57 (-0700),
David Henkemeyer wrote:
I am seeing what I think might be a bug with sacct. When I do the
*> sbatch --export=NONE --wrap='uname -a' --exclusive*
*Submitted batch job 2869585*
Then, I ask sacct for the SubmitLine, as such:
*> sacc
Slurm versions 21.08.8 and 20.11.9 are now available to address a
critical security issue with Slurm's authentication handling.
SchedMD customers were informed on April 20th and provided a patch on
request; this process is documented in our security policy [1].
For SchedMD customers: please n
Thank you, Michael! In fact, it appears as though Slurm is storing the
entire commandline as a single "word":
(! )-> sacct -j 2871474 -o "SubmitLine%-100"
sbatch --export=NONE --wrap
We have a node assigned to a group, with both of its GPUs in dedicated mode,
by setting a six-month reservation on the node. We want to find the weekly
GPU-hour utilization of that particular reserved node. The node is not
in a separate partition.
The command below does not help in showing the a
I am wondering what is the best way to update node changes, such as
addition and removal of nodes to SLURM. The excerpts below suggest a full
restart, can someone confirm this? Or perhaps `scontrol reconfigure` or
`kill -s SIGHUP` does it?
best wishes: steven
// src/slurmctld/read_config.c
On 5/4/22 7:26 pm, Steven Varga wrote:
I am wondering what is the best way to update node changes, such as
addition and removal of nodes to SLURM. The excerpts below suggest a
full restart, can someone confirm this?
You are correct, you need to restart slurmctld and slurmd daemons at