Dear Marcus,

Thanks a lot for your reply. I will add it to our User Manual so that users know how to request multiple GPU cards.
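
For the manual I plan to include the header that ended up working in this thread, roughly along the lines of the sketch below (the partition, QOS, account and output names are just the ones from this particular job and will differ elsewhere; only the --gres count was changed from 16 to 8, as Antony suggested, since --gres is counted per node):

#SBATCH --partition=gpu
#SBATCH --qos=normal
#SBATCH --account=u07
#SBATCH --job-name=cross
#SBATCH --nodes=2
#SBATCH --mem-per-cpu=1024
#SBATCH --output=test.32^4.16gpu.log
# --gres is requested per node, so 2 nodes x 8 GPUs = 16 GPUs in total
#SBATCH --gres=gpu:v100:8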
Best regards,
Ran

On Tue, Apr 16, 2019 at 5:40 PM Marcus Wagner <wag...@itc.rwth-aachen.de> wrote:
> Dear Ran,
>
> you can only ask for GPUs PER NODE, as gres are resources per node.
>
> So, you can ask for 5 GPUs and then get 5 GPUs on each of the two nodes.
> At the moment it is not possible to ask for 8 GPUs on one node and 2 on
> another.
> That MIGHT change with Slurm 19.05, since SchedMD is overhauling, among
> other things, the GPU handling within Slurm.
>
>
> Best
> Marcus
>
> On 4/16/19 9:15 AM, Ran Du wrote:
> > Dear Antony,
> >
> > It worked!
> >
> > I checked the allocation, and here is the record:
> >
> > Nodes=gpu012 CPU_IDs=0-2 Mem=3072 GRES_IDX=gpu:v100(IDX:0-7)
> > Nodes=gpu013 CPU_IDs=0 Mem=1024 GRES_IDX=gpu:v100(IDX:0-7)
> >
> > The job got what it asked for.
> >
> > Another question: how can one request a number of cards that cannot
> > be divided exactly by 8? For example, 10 GPU cards, with 8 cards on
> > one node and 2 cards on another node?
> >
> > Thanks a lot again for your kind help.
> >
> > Best regards,
> > Ran
> >
> >
> > On Mon, Apr 15, 2019 at 8:25 PM Ran Du <bella.ran...@gmail.com> wrote:
> >> Dear Antony,
> >>
> >> Thanks a lot for your reply. I submitted a job following your
> >> advice, and there were no more sbatch errors.
> >>
> >> But because our cluster is under maintenance, I have to wait until
> >> tomorrow to see whether the GPU cards are allocated correctly. I will
> >> let you know as soon as the job runs successfully.
> >>
> >> Thanks a lot for your kind help.
> >>
> >> Best regards,
> >> Ran
> >>
> >> On Mon, Apr 15, 2019 at 4:40 PM Antony Cleave <antony.cle...@gmail.com>
> >> wrote:
> >>
> >>> Ask for 8 GPUs on 2 nodes instead.
> >>>
> >>> In your script, just change the 16 to 8 and it should do what you want.
> >>>
> >>> You are currently asking for 2 nodes with 16 GPUs each, as gres
> >>> resources are per node.
> >>>
> >>> Antony
> >>>
> >>> On Mon, 15 Apr 2019, 09:08 Ran Du, <bella.ran...@gmail.com> wrote:
> >>>
> >>>> Dear all,
> >>>>
> >>>> Does anyone know how to set #SBATCH options to get multiple GPU
> >>>> cards from different worker nodes?
> >>>>
> >>>> One of our users would like to request 16 NVIDIA V100 cards for
> >>>> his job, and there are 8 GPU cards on each worker node. I have tried
> >>>> the following #SBATCH options:
> >>>>
> >>>> #SBATCH --partition=gpu
> >>>> #SBATCH --qos=normal
> >>>> #SBATCH --account=u07
> >>>> #SBATCH --job-name=cross
> >>>> #SBATCH --nodes=2
> >>>> #SBATCH --mem-per-cpu=1024
> >>>> #SBATCH --output=test.32^4.16gpu.log
> >>>> #SBATCH --gres=gpu:v100:16
> >>>>
> >>>> but got the sbatch error message:
> >>>> sbatch: error: Batch job submission failed: Requested node
> >>>> configuration is not available
> >>>>
> >>>> I also found a similar question on Stack Overflow:
> >>>>
> >>>> https://stackoverflow.com/questions/45200926/how-to-access-to-gpus-on-different-nodes-in-a-cluster-with-slurm
> >>>>
> >>>> It says that allocating GPU cards across different worker nodes is
> >>>> not possible. The post is from 2017; is that still true at present?
> >>>>
> >>>> Thanks a lot for your help.
> >>>>
> >>>> Best regards,
> >>>> Ran
> >>>>
> >>>
> --
> Marcus Wagner, Dipl.-Inf.
>
> IT Center
> Abteilung: Systeme und Betrieb
> RWTH Aachen University
> Seffenter Weg 23
> 52074 Aachen
> Tel: +49 241 80-24383
> Fax: +49 241 80-624383
> wag...@itc.rwth-aachen.de
> www.itc.rwth-aachen.de