Thanks for your suggestion. You are right, I do not have to deal with
specific GPU IDs.
(I have not tried to compile your code; I simply tested two Gromacs runs
on the same node, each submitted with --gres=gpu:1.)
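For reference, a minimal sketch of such a submission script (the module
name and input name are placeholders for whatever your site provides):
=====
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1
# Placeholder module and input names -- adjust for your own site
module load gromacs
gmx mdrun -deffnm md
=====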
On 11/13/19 5:17 PM, Renfro, Michael wrote:
Pretty sure you don’t need to explicitly specify GPU IDs on a Gromacs job
running inside of Slurm with gres=gpu. Gromacs should only see the GPUs you
have reserved for that job.
Here’s a small device-query program you can run to verify that two
simultaneous GPU jobs see different GPU devices (compile with nvcc):
=====
// From http://www.cs.fsu.edu/~xyuan/cda5125/examples/lect24/devicequery.cu
#include <stdio.h>
#include <cuda_runtime.h> // implicit under nvcc; included here for clarity

void printDevProp(cudaDeviceProp dP)
{
    printf("%s has %d multiprocessors\n", dP.name, dP.multiProcessorCount);
    printf("%s has PCI BusID %d, DeviceID %d\n", dP.name, dP.pciBusID,
           dP.pciDeviceID);
}

int main()
{
    // Number of CUDA devices visible to this process
    int devCount;
    cudaGetDeviceCount(&devCount);
    printf("There are %d CUDA devices.\n", devCount);

    // Iterate through the visible devices and print their properties
    for (int i = 0; i < devCount; ++i)
    {
        printf("CUDA Device #%d: ", i);
        cudaDeviceProp devProp;
        cudaGetDeviceProperties(&devProp, i);
        printDevProp(devProp);
    }
    return 0;
}
=====
When run from two simultaneous jobs on the same node (each with its own
gres=gpu allocation), I get:
=====
[renfro@gpunode003(job 221584) hw]$ ./cuda_props
There are 1 CUDA devices.
CUDA Device #0: Tesla K80 has 13 multiprocessors
Tesla K80 has PCI BusID 5, DeviceID 0
=====
[renfro@gpunode003(job 221585) hw]$ ./cuda_props
There are 1 CUDA devices.
CUDA Device #0: Tesla K80 has 13 multiprocessors
Tesla K80 has PCI BusID 6, DeviceID 0
=====
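In case anyone wonders how the restriction is enforced: on many clusters
Slurm exports a per-job CUDA_VISIBLE_DEVICES for gres=gpu allocations
(some sites additionally constrain devices via cgroups), so each job sees
only its own GPU(s). A quick way to check from inside an allocation:
=====
$ srun --gres=gpu:1 bash -c 'echo $CUDA_VISIBLE_DEVICES'
=====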
--
Tamas Hegedus, PhD
Senior Research Fellow
Department of Biophysics and Radiation Biology
Semmelweis University | phone: (36) 1-459 1500/60233
Tuzolto utca 37-47 | mailto:ta...@hegelab.org
Budapest, 1094, Hungary | http://www.hegelab.org