Hi All, I have recently set up a Slurm cluster with my servers and I'm running into an issue while submitting GPU jobs. It has something to do with the GRES configuration, but I just can't seem to figure out what is wrong. Non-GPU jobs run fine.
The error is as follows:

    sbatch: error: Batch job submission failed: Invalid Trackable RESource (TRES) specification

It appears after submitting a batch job. My batch job is as follows:

    #!/bin/bash
    #SBATCH --partition=tiger_1   # partition name
    #SBATCH --gres=gpu:k20:1
    #SBATCH --gres-flags=enforce-binding
    #SBATCH --time=0:20:00        # wall clock limit
    #SBATCH --output=gpu-%J.txt
    #SBATCH --account=lnicotra
    module load cuda
    python gpu1

Where gpu1 is a GPU test script that runs correctly when invoked via python. The tiger_1 partition has servers with GPUs, with a mix of 1080GTX and K20 as specified in slurm.conf.

I have defined GRES resources in the slurm.conf file:

    # GPU GRES
    GresTypes=gpu
    NodeName=tiger[01,05,10,15,20] Gres=gpu:1080gtx:2
    NodeName=tiger[02-04,06-09,11-14,16-19,21-22] Gres=gpu:k20:2

And have a local gres.conf on the servers containing GPUs...

    lnicotra@tiger11 ~# cat /etc/slurm/gres.conf
    # GPU Definitions
    # NodeName=tiger[02-04,06-09,11-14,16-19,21-22] Name=gpu Type=K20 File=/dev/nvidia[0-1]
    Name=gpu Type=K20 File=/dev/nvidia[0-1] Cores=0,1

and a similar one for the 1080GTX

    # GPU Definitions
    # NodeName=tiger[01,05,10,15,20] Name=gpu Type=1080GTX File=/dev/nvidia[0-1]
    Name=gpu Type=1080GTX File=/dev/nvidia[0-1] Cores=0,1

The account manager seems to know about the GPUs...

    lnicotra@tiger11 ~# sacctmgr show tres
    Type     Name            ID
    -------- --------------- ------
    cpu                      1
    mem                      2
    energy                   3
    node                     4
    billing                  5
    fs       disk            6
    vmem                     7
    pages                    8
    gres     gpu             1001
    gres     gpu:k20         1002
    gres     gpu:1080gtx     1003

Can anyone point out what I am missing? Thanks!
Lou

--
Lou Nicotra
IT Systems Engineer - SLT
Interactions LLC
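For anyone who wants to cross-check the three places where the GPU type name appears in this post (the `--gres` request, the `Gres=` entries in slurm.conf, and the `Type=` entries in gres.conf), here is a minimal Python sketch. It only works on the snippets quoted above, pasted in as strings; it does not query a cluster. The helper names (`slurm_conf_types`, `gres_conf_types`, `requested_type`, `report`) are hypothetical, and the comparison is a plain exact-string match. Whether Slurm itself treats these names case-sensitively is an assumption not verified here, so the output is a list of spelling differences to look at, not a diagnosis.

```python
import re

# Snippets copied from the post; nothing here reads a live cluster.
BATCH_GRES = "gpu:k20:1"
SLURM_CONF = """\
NodeName=tiger[01,05,10,15,20] Gres=gpu:1080gtx:2
NodeName=tiger[02-04,06-09,11-14,16-19,21-22] Gres=gpu:k20:2
"""
GRES_CONF = """\
# NodeName=tiger[02-04,06-09,11-14,16-19,21-22] Name=gpu Type=K20 File=/dev/nvidia[0-1]
Name=gpu Type=K20 File=/dev/nvidia[0-1] Cores=0,1
"""


def slurm_conf_types(text):
    """Collect the type field from each Gres=name:type:count entry."""
    return {m.group(1) for m in re.finditer(r"Gres=\w+:(\w+):\d+", text)}


def gres_conf_types(text):
    """Collect Type= values, ignoring anything after a '#' comment marker."""
    types = set()
    for line in text.splitlines():
        active = line.split("#", 1)[0]
        types.update(re.findall(r"\bType=(\S+)", active))
    return types


def requested_type(gres):
    """Return the type from a name:type:count request, or None if absent."""
    parts = gres.split(":")
    return parts[1] if len(parts) == 3 else None


def report(batch_gres, slurm_conf, gres_conf):
    """Exact-string comparison of the requested type against both files."""
    want = requested_type(batch_gres)
    found = gres_conf_types(gres_conf)
    return {
        "requested": want,
        "in_slurm_conf": want in slurm_conf_types(slurm_conf),
        "in_gres_conf": want in found,
        "gres_conf_types": sorted(found),
    }


if __name__ == "__main__":
    print(report(BATCH_GRES, SLURM_CONF, GRES_CONF))
```

With the snippets as quoted, the requested type string appears verbatim in the slurm.conf entries but only in upper case in gres.conf, so the sketch flags that spelling difference as something to rule out.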