Dear Antony,

It worked!
I checked the allocation, and here is the record:

Nodes=gpu012 CPU_IDs=0-2 Mem=3072 GRES_IDX=gpu:v100(IDX:0-7)
Nodes=gpu013 CPU_IDs=0 Mem=1024 GRES_IDX=gpu:v100(IDX:0-7)

The job got exactly what it asked for.

One more question: how can a job request a number of cards that is not evenly divisible by 8? For example, to request 10 GPU cards: 8 cards on one node and 2 cards on another node?

Thanks a lot again for your kind help.

Best regards,
Ran

On Mon, Apr 15, 2019 at 8:25 PM Ran Du <bella.ran...@gmail.com> wrote:
> Dear Antony,
>
> Thanks a lot for your reply. I tried to submit a job following your
> advice, and there were no more sbatch errors.
>
> But because our cluster is under maintenance, I have to wait until
> tomorrow to see whether the GPU cards are allocated correctly. I will let
> you know as soon as the job runs successfully.
>
> Thanks a lot for your kind help.
>
> Best regards,
> Ran
>
> On Mon, Apr 15, 2019 at 4:40 PM Antony Cleave <antony.cle...@gmail.com>
> wrote:
>
>> Ask for 8 GPUs on each of 2 nodes instead.
>>
>> In your script, just change the 16 to 8 and it should do what you want.
>>
>> You are currently asking for 2 nodes with 16 GPUs each, because GRES
>> resources are requested per node.
>>
>> Antony
>>
>> On Mon, 15 Apr 2019, 09:08 Ran Du, <bella.ran...@gmail.com> wrote:
>>
>>> Dear all,
>>>
>>> Does anyone know how to set #SBATCH options to get multiple GPU
>>> cards from different worker nodes?
>>>
>>> One of our users would like to request 16 NVIDIA V100 cards for
>>> his job, and there are 8 GPU cards on each worker node. I have tried the
>>> following #SBATCH options:
>>>
>>> #SBATCH --partition=gpu
>>> #SBATCH --qos=normal
>>> #SBATCH --account=u07
>>> #SBATCH --job-name=cross
>>> #SBATCH --nodes=2
>>> #SBATCH --mem-per-cpu=1024
>>> #SBATCH --output=test.32^4.16gpu.log
>>> #SBATCH --gres=gpu:v100:16
>>>
>>> but got this sbatch error message:
>>>
>>> sbatch: error: Batch job submission failed: Requested node
>>> configuration is not available
>>>
>>> I found a similar question on Stack Overflow:
>>>
>>> https://stackoverflow.com/questions/45200926/how-to-access-to-gpus-on-different-nodes-in-a-cluster-with-slurm
>>>
>>> It says that allocating multiple GPU cards across different worker
>>> nodes is not possible, but that post is from 2017. Is it still true at
>>> present?
>>>
>>> Thanks a lot for your help.
>>>
>>> Best regards,
>>> Ran
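The two requests discussed in this thread can be sketched as follows. This is a minimal sketch, not a tested script: the partition, account, and QOS names are taken from the original post, and the application name in the srun line is hypothetical. The first script applies Antony's fix: since --gres is a per-node request, 16 GPUs across 2 nodes is asked for as 8 GPUs per node.

```shell
#!/bin/bash
# 16 GPUs total = 2 nodes x 8 GPUs per node.
# --gres counts resources PER NODE, not per job.
#SBATCH --partition=gpu
#SBATCH --qos=normal
#SBATCH --account=u07
#SBATCH --nodes=2
#SBATCH --mem-per-cpu=1024
#SBATCH --gres=gpu:v100:8

srun ./my_gpu_app   # hypothetical application
```

For the follow-up question (10 GPUs as 8 + 2), one possibility is a Slurm heterogeneous job, where each component carries its own per-node GRES request. Note the separator keyword is version-dependent: `#SBATCH packjob` in Slurm 17.11 through 19.05, `#SBATCH hetjob` in 20.02 and later (and `--pack-group` vs. `--het-group` for srun, correspondingly).

```shell
#!/bin/bash
# Heterogeneous job: component 0 gets 1 node with 8 GPUs,
# component 1 gets 1 node with 2 GPUs (10 GPUs total).
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:v100:8
#SBATCH packjob
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:v100:2

# Launch one application across both components.
srun --pack-group=0,1 ./my_gpu_app   # hypothetical application
```

Whether this fits depends on the application: MPI codes that span heterogeneous components need a Slurm version and MPI stack that support launching across het groups.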