I have 4 GRES GPUs of type foolsgold that I am trying to allocate, one per job. But requesting 1 GPU seems to allocate all of the GPUs to that job.

My batch script is:

#!/bin/bash
#SBATCH --partition=scavenge
#SBATCH --qos=scavenge
#SBATCH --account=borrowed
#SBATCH --nodes=1
#SBATCH --tasks=1
#SBATCH --time=00:05:20
#SBATCH --gpus=foolsgold:1
date
hostname -s
for ((i=1;i<=1000000000;i++)) ; do a=$((i++)) ; done
date

And the partition definition is:

PartitionName=scavtres Nodes=saga-test01,saga-test02 MaxTime=72:00:00 State=UP PriorityTier=0 PreemptMode=REQUEUE AllowQos=scavenge AllowAccounts=borrowed,gaia default=yes TRESBillingWeights="CPU=1.0,Mem=0.25G,GRES/foolsgold=200.0" OverSubscribe=FORCE

I have 2 compute nodes in this test cluster, each one with 4 GPUs defined:

NodeName=saga-test01 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN Gres=gpu:foolsgold:4
NodeName=saga-test02 CPUS=2 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=1800 State=UNKNOWN Gres=gpu:foolsgold:4

The /etc/slurm/gres.conf on the two compute nodes:

Name=gpu Type=foolsgold File=/tmp/fg0
Name=gpu Type=foolsgold File=/tmp/fg1
Name=gpu Type=foolsgold File=/tmp/fg2
Name=gpu Type=foolsgold File=/tmp/fg3

How can I get one GPU allocated per job?

Thanks,
Erik