date:20240704

[slurm-users] Re: Using sharding

2024-07-04 Thread Ward Poelmans via slurm-users

Hi Ricardo,

It should show up like this:

Gres=gpu:gtx_1080_ti:4(S:0-1),shard:gtx_1080_ti:16(S:0-1)
CfgTRES=cpu=32,mem=515000M,billing=130,gres/gpu=4,gres/shard=16
AllocTRES=cpu=8,mem=31200M,gres/shard=1

I can't directly spot any error however. Our gres.conf is simply `AutoDetect=nvm
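When checking why a shard job stays pending, it can help to subtract AllocTRES from CfgTRES and see what is actually left on the node. The following is a minimal sketch in Python, using only the two TRES strings quoted in this message; the helper names `parse_tres` and `free_tres` are hypothetical and not part of Slurm, and the sketch assumes memory values carry an `M` suffix as they do here.

```python
def parse_tres(tres: str) -> dict[str, str]:
    """Split a TRES string such as 'cpu=32,mem=515000M' into a dict."""
    return dict(item.split("=", 1) for item in tres.split(",") if item)


def to_int(value: str) -> int:
    """Convert a TRES value to an int, dropping a trailing 'M' (assumed megabytes)."""
    return int(value.rstrip("M"))


def free_tres(cfg: str, alloc: str, keys: tuple[str, ...]) -> dict[str, int]:
    """Return configured minus allocated for each requested key."""
    cfg_map = parse_tres(cfg)
    alloc_map = parse_tres(alloc)
    # A key missing from AllocTRES means nothing of that resource is allocated.
    return {k: to_int(cfg_map[k]) - to_int(alloc_map.get(k, "0")) for k in keys}


# Values copied from the scontrol output quoted above.
cfg = "cpu=32,mem=515000M,billing=130,gres/gpu=4,gres/shard=16"
alloc = "cpu=8,mem=31200M,gres/shard=1"

free = free_tres(cfg, alloc, ("cpu", "mem", "gres/shard"))
print(free)
```

If any of the free values is lower than what the pending job requests (including an implicit memory request), the job cannot start on that node.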

[slurm-users] Re: Using sharding

2024-07-04 Thread Brian Andrus via slurm-users

Just a thought: try specifying some memory. It looks like the running jobs do that, and by default, if memory is not specified, the request is "all the memory on the node", so the job can't start because some of it is taken.

Brian Andrus

On 7/4/2024 9:54 AM, Ricardo Cruz wrote:
Dear Brian, Currently, we have 5 GPUs

[slurm-users] Re: Using sharding

2024-07-04 Thread Ricardo Cruz via slurm-users

Dear Brian,

Currently, we have 5 GPUs available (out of 8).

rpcruz@atlas:~$ /usr/bin/srun --gres=shard:2 ls
srun: job 515 queued and waiting for resources

The job shows as PD in squeue. scontrol says that 5 GPUs are allocated out of 8...

rpcruz@atlas:~$ scontrol show node compute01
NodeName=com

[slurm-users] Re: Using sharding

2024-07-04 Thread Brian Andrus via slurm-users

To help dig into it, can you paste the full output of scontrol show node compute01 while the job is pending? Also 'sinfo' would be good.

It is basically telling you there aren't enough resources in the partition to run the job. Often this is because all the nodes are in use at that moment.

B

[slurm-users] Using sharding

2024-07-04 Thread Ricardo Cruz via slurm-users

Greetings,

There are not many questions regarding GPU sharding here, and I am unsure if I am using it correctly... I have configured it according to the instructions, and it seems to be configured properly:

$ scontrol show node compute01
NodeName=compute01 Ar

5 matches
