As I understand it, that setting means "always have at least X nodes
up", which includes nodes running jobs. So it eliminates the wait time
for the first X jobs submitted, but any jobs after that will need to
wait for the power_up sequence.
Brian Andrus
On 11/22/2023 6:58 AM, Davide DelVento wrote:
I've started playing with powersave and have a question about
SuspendExcNodes. The documentation at
https://slurm.schedmd.com/power_save.html says
For example nid[10-20]:4 will prevent 4 usable nodes (i.e. IDLE and
not DOWN, DRAINING or already powered down) in the set
nid[10-20] from being powered down.
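For reference, this is roughly where that knob sits among the
power_save parameters in slurm.conf (the values and script paths below
are placeholders, not my actual config):

    SuspendTime=600                            # idle seconds before powering a node down
    SuspendProgram=/usr/local/sbin/suspend.sh  # site-specific script
    ResumeProgram=/usr/local/sbin/resume.sh    # site-specific script
    SuspendExcNodes=nid[10-20]:4               # the setting in question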
I initially interpreted that as "Slurm will try to keep 4 nodes idle
and powered on as much as possible", which would have reduced the wait
time for new jobs targeting those nodes. Instead, it appears to mean
"Slurm will not shut off the last 4 idle nodes in that partition;
however, it will not turn back on nodes it shut off earlier unless
jobs are scheduled on them".
Most notably, if the 4 idle nodes get allocated to other jobs (so they
are no longer idle), Slurm does not turn on any of the nodes it shut
off earlier, so it's possible (and depending on workload perhaps even
common) to have no powered-on idle nodes at all, regardless of the
SuspendExcNodes setting.
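That's easy to watch with sinfo, since %T prints "idle" for a
powered-on idle node and "idle~" for one that has been powered down
(partition name here is just an example):

    $ sinfo -h -p nid -N -o "%T" | sort | uniq -c

When the behavior above kicks in, the plain "idle" count drops to zero
while the "idle~" count stays put.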
Is that how it works, or is something else in my settings causing this
unexpected-to-me behavior? I think I can live with it (see the sketch
below for a possible workaround), but IMHO it would be better if Slurm
preemptively turned nodes back on to match the requested
SuspendExcNodes, rather than waiting for job submissions.
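In case it's useful to anyone else, I'm considering working around it
with something like the following, run from cron: count the powered-on
idle nodes and, if there are fewer than desired, ask slurmctld to boot
some of the powered-down ones. Untested sketch; the partition name and
threshold are placeholders:

    #!/bin/bash
    # Keep at least WANT powered-on idle nodes in partition PART.
    WANT=4
    PART=nid
    # %T prints "idle" for powered-on idle nodes, "idle~" for powered-down ones
    on_idle=$(sinfo -h -p "$PART" -N -o "%T" | grep -cx idle)
    need=$(( WANT - on_idle ))
    [ "$need" -le 0 ] && exit 0
    # Pick powered-down idle nodes and request they be powered up
    sinfo -h -p "$PART" -N -o "%N %T" | awk '$2 == "idle~" {print $1}' \
        | head -n "$need" \
        | while read -r node; do
            scontrol update NodeName="$node" State=POWER_UP
          done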
Thanks and Happy Thanksgiving to people in the USA