John: thanks for the link. Curiously, sinfo doesn't show the asterisk, but
has it documented. scontrol shows the asterisk and doesn't document it...
at least for the state my cluster is in.
Antony: Thanks for the steps. I tried them out, but there was no change. It
seems like it should do the tri
If it is any help, https://slurm.schedmd.com/sinfo.html
NODE STATE CODES
Node state codes are shortened as required for the field size. These node
states may be followed by a special character to identify state flags
associated with the node. The following node suffixes and states are used:
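
To make the suffix idea concrete, here is a minimal sketch of a filter that picks out nodes whose state string ends in a given flag character. The function name nodes_with_flag is hypothetical, and it only assumes input lines of the form "NODENAME STATE", which something like `sinfo -h -N -o '%N %t'` should produce; whether the asterisk actually shows up there may depend on the state the cluster is in.

```shell
# Hypothetical helper: print the names of nodes whose state ends in a
# given suffix flag (for example '*' or '~').
# Expects "NODENAME STATE" pairs on stdin, one per line.
nodes_with_flag() {
    flag="$1"
    # Compare the last character of the state field with the flag.
    awk -v f="$flag" 'substr($2, length($2), 1) == f { print $1 }'
}

# Canned input, so this runs without a Slurm installation.
printf 'cloud01 idle*\ncloud02 idle~\ncloud03 alloc\n' | nodes_with_flag '*'
```

On a live cluster the same function could be fed from `sinfo -h -N -o '%N %t' | nodes_with_flag '*' | sort -u`; the `sort -u` is there because `sinfo -N` can list a node once per partition.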
***
I've not seen the IDLE* issue before, but when my nodes got stuck I've
always been able to fix them with this:
[root@cloud01 ~]# scontrol update nodename=cloud01 state=down reason=stuck
[root@cloud01 ~]# scontrol update nodename=cloud01 state=idle
[root@cloud01 ~]# scontrol update nodename=cloud01
Hi
I'm running a cluster in a cloud provider and have run up against an odd
problem with power save. I've got several hundred nodes that Slurm won't
power up even though they appear idle and in the powered-down state. I
suspect that they are in a "not-so-idle" state: `scontrol` for all of the
no