[slurm-users] SLURM GRES reservation not working properly on 24.05.1

2024-09-20 Thread Minulakshmi S via slurm-users
Hello, *Issue 1:* I am using Slurm version 24.05.1. My slurmd has a single node to which I connect multiple GRES by enabling the oversubscribe feature. I am able to use an advance reservation of GRES only by *gres name* (tres=gres/gpu:*SYSTEM12*), i.e. while in the reservation period, if other user

[slurm-users] what updates NODEADDR

2024-09-20 Thread Jakub Szarlat via slurm-users
Hi, I'm using dynamic nodes with "slurmd -Z" on Slurm 23.11.1. First, I find that "scontrol show node" shows the NODEADDR as an IP address rather than the NODENAME. Because I'm experimenting with running this in containers on Docker Swarm, I find this IP can be wrong. I can force it with

[slurm-users] Re: Can't schedule on cloud node: State=IDLE+CLOUD+POWERED_DOWN+NOT_RESPONDING

2024-09-20 Thread Xaver Stiensmeier via slurm-users
Hey Nate, we actually fixed the underlying issue that caused the NOT_RESPONDING flag: on failures we automatically terminated the node ourselves instead of letting Slurm call the terminate script. That led to Slurm believing the node should still be there when it had already been terminated. Therefore,