Dear Team,
I created a small 3-node cluster on VMware to work on the CPU
utilization concept.
I created a user, hpcuser01, and set GrpTRESMins=cpu=5940 (CPU
minutes) and gpu=0.
Now, when I check this user's utilization using the scontrol association
command:
# scontrol show ass user=hp
Dear slurm-user list,
I have had cases where our ResumeProgram failed due to temporary cloud
timeouts. In those cases the ResumeProgram returns a non-zero exit code.
Why does Slurm still wait until ResumeTimeout instead of simply treating
the startup as failed, which should then lead to a rescheduling of the job
Hi,
I need to take care of a 17.02 Slurm cluster (I'm preparing it for
upgrades). I see that slurmdbd logs various "cluster not registered"
messages at startup (DBD_CLUSTER_TRES, DBD_JOB_START, DBD_STEP_START), but
I don't see a real problem. Accounting works. Do I have to worry? Can
this be re