Thanks for your suggestion. You are right: I do not need to deal with specific GPU IDs. (I have not tried to compile your code; I simply tested two Gromacs runs on the same node with --gres=gpu:1.)

On 11/13/19 5:17 PM, Renfro, Michael wrote:
Pretty sure you don’t need to explicitly specify GPU IDs for a Gromacs job 
running inside Slurm with --gres=gpu. Gromacs should only see the GPUs 
reserved for that job.
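
For context, a minimal job-script sketch (the run name and resource lines are 
just placeholders, not from the original thread). Slurm exports 
CUDA_VISIBLE_DEVICES for the allocated GPU(s), which is what limits the 
devices any CUDA application, including Gromacs, can enumerate:

```shell
#!/bin/bash
#SBATCH --job-name=gmx-gpu
#SBATCH --gres=gpu:1      # request one GPU; Slurm picks the physical device
#SBATCH --ntasks=1

# Slurm sets CUDA_VISIBLE_DEVICES to the GPU(s) allocated to this job,
# so CUDA programs only see those devices (each shows up as device 0).
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"

gmx mdrun -deffnm md      # "md" is a placeholder run name
```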

Here’s a short verification program you can run to confirm that two different 
GPU jobs see different GPU devices (compile with nvcc):

=====

// From http://www.cs.fsu.edu/~xyuan/cda5125/examples/lect24/devicequery.cu
#include <stdio.h>
void printDevProp(cudaDeviceProp dP)
{
     printf("%s has %d multiprocessors\n", dP.name, dP.multiProcessorCount);
     printf("%s has PCI BusID %d, DeviceID %d\n", dP.name, dP.pciBusID, dP.pciDeviceID);
}
int main()
{
     // Number of CUDA devices
     int devCount; cudaGetDeviceCount(&devCount);
     printf("There are %d CUDA devices.\n", devCount);
     // Iterate through devices
     for (int i = 0; i < devCount; ++i)
     {
         // Get device properties
         printf("CUDA Device #%d: ", i);
         cudaDeviceProp devProp; cudaGetDeviceProperties(&devProp, i);
         printDevProp(devProp);
     }
     return 0;
}

=====

When run from two simultaneous jobs on the same node (each with --gres=gpu), I get:

=====

[renfro@gpunode003(job 221584) hw]$ ./cuda_props
There are 1 CUDA devices.
CUDA Device #0: Tesla K80 has 13 multiprocessors
Tesla K80 has PCI BusID 5, DeviceID 0

=====

[renfro@gpunode003(job 221585) hw]$ ./cuda_props
There are 1 CUDA devices.
CUDA Device #0: Tesla K80 has 13 multiprocessors
Tesla K80 has PCI BusID 6, DeviceID 0

=====


--
Tamas Hegedus, PhD
Senior Research Fellow
Department of Biophysics and Radiation Biology
Semmelweis University     | phone: (36) 1-459 1500/60233
Tuzolto utca 37-47        | mailto:ta...@hegelab.org
Budapest, 1094, Hungary   | http://www.hegelab.org
