Would appreciate any leads on the above query. Thanks in advance.

On Fri, 20 Sept 2024 at 14:31, Minulakshmi S <minulakshm...@gmail.com>
wrote:

> Hello,
>
> *Issue 1:*
> I am using Slurm version 24.05.1. My slurmd has a single node where I
> connect multiple GRES by enabling the oversubscribe feature.
> I am able to use the advance reservation of GRES only by using the *gres name*
> (tres=gres/gpu:*SYSTEM12*).
>
>
> i.e. during the reservation period, if other users submit a job with
> SYSTEM12, then Slurm places the job in the queue:
>
> *user1@host$ srun --gres=gpu:SYSTEM12:1 hostname*
> *srun: job 333 queued and waiting for resources *
>
> But when other users submit a job without any system name, the Slurm
> job runs on that GRES immediately even though it is reserved.
>
> *user1@host$ srun --gres=gpu:1 hostname*
> *mylinux.wbi.com <http://mylinux.wbi.com/>*
>
>
> Also, I can see GresUsed in busy mode using "*scontrol show node -d*",
> which means the job is running on the GRES/GPU and not on CPU etc.
>
>
> In the same way, job submission based on a Feature ("rev1" in my case) also
> goes through even though it is reserved for other users in a multi-partition
> Slurm setup.
>
> *snippet of slurm.conf file*
> NodeName=cluster01 NodeAddr=cluster Port=6002 CPUs=8 Boards=1
> SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 Feature="rev1"
> Gres=gpu:SYSTEM12:1 RealMemory=64171 State=IDLE
>
> *Issue 2:*
>
> During execution, Slurm prints some extra messages in the srun output:
>
> user1@host$ srun --gres=gpu:1 hostname
> srun: error: extract_net_cred: net_cred not provided
> srun: error: Malformed RPC of type RESPONSE_NODE_ALIAS_ADDRS(3017) received
> srun: error: slurm_unpack_received_msg: [mylinux.wbi.com]:41242] Header lengths are longer than data received
> *mylinux.wbi.com <http://mylinux.wbi.com/>*
>
> Regards,
> MS
>
-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
