[slurm-users] How to show state of CLOUD nodes

2020-02-27 Thread Carter, Allan
I'm setting up an EC2 SLURM cluster, and when an instance doesn't resume fast enough I get an error like: node c7-c5-24xl-464 not resumed by ResumeTimeout(600) - marking down and power_save. I keep running into issues where my cloud nodes do not show up in sinfo and I can't display their informa

Re: [slurm-users] How to show state of CLOUD nodes

2020-02-28 Thread Carter, Allan
down, etc. Thanks for the help. I think it will solve the issues I’m having. From: Kirill 'kkm' Katsnelson [mailto:k...@pobox.com] Sent: Friday, February 28, 2020 5:56 AM To: Slurm User Community List Cc: Carter, Allan Subject: Re: [slurm-users] How to show state of CLOUD nodes I

[slurm-users] Job are pending when plenty of resources available

2020-03-29 Thread Carter, Allan
I'm perplexed. My cluster has been churning along and tonight it has decided to start pending jobs even though there are plenty of nodes available. An example job from squeue: JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) 409978 interactiver

[slurm-users] Preemption for licenses

2022-12-09 Thread Carter, Allan
If a job is pending only because it needs a license and all are being used, can it preempt jobs in a lower priority partition that are using the license? Or does preemption only work for compute resources? I've tried to configure preemption, but when I submit a job that used my only license and