Re: [slurm-users] Need to restart slurmctld for gres jobs to start

2022-06-24 Thread Ryan Novosielski
On 6/2/22 14:02, tluchko wrote:
> Hello, I have recently started to have problems where jobs sit in the queue waiting for resources to become available, even when the resources are available. If I stop and restart slurmctld, the pending jobs start running. This seems to be related to GRES jobs

Re: [slurm-users] Need to restart slurmctld for gres jobs to start

2022-06-02 Thread Bjørn-Helge Mevik
tluchko writes:
> Jobs only sit in the queue with RESOURCES as the REASON when we
> include the flag --gres=bandwidth:ib. If we remove the flag, the jobs
> run fine. But we need the flag to ensure that we don't get a mix of IB
> and ethernet nodes because they fail in this case.
This doesn't ans