On Tue, Mar 19, 2019 at 8:34 AM Peter Steinbach <stein...@mpi-cbg.de> wrote:
>
> Hi,
>
> we are struggling with a slurm 18.08.5 installation of ours. We are in a
> situation where our GPU nodes have a considerable number of cores but
> "only" 2 GPUs inside. While people run jobs using the GPUs, non-GPU jobs
> can enter alright. However, we found out the hard way that the inverse
> is not true.
>
> For example, let's say I have a 4-core GPU node called gpu1. A non-GPU job
> $ sbatch --wrap="sleep 10 && hostname" -c 3
> comes in and starts running on gpu1.
> We observed that the job produced by the following command targeting
> the same node:
> $ sbatch --wrap="hostname" -c 1 --gres=gpu:1 -w gpu1
> will wait indefinitely for available resources until the non-GPU job is
> finished. This is not something we want.
>
> The sample gres.conf and slurm.conf from a docker-based slurm cluster
> where I was able to reproduce the issue are available here:
> https://raw.githubusercontent.com/psteinb/docker-centos7-slurm/18.08.5-with-gres/slurm.conf
> https://raw.githubusercontent.com/psteinb/docker-centos7-slurm/18.08.5-with-gres/gres.conf
>
> We are not sure how to handle the situation, as we would like both jobs
> to enter the GPU node and run at the same time to maximize the utility
> of our hardware to our users.
>
> Any hints or ideas are highly appreciated.
> Thanks for your help,
> Peter
>
You don't mention your SelectType or SelectTypeParameters settings; by default I believe Slurm allocates whole nodes to jobs, which would explain the behaviour you are seeing. We have CR_LLN set in ours, which spreads jobs across the least-loaded nodes instead. My memory is a little foggy on the details, but there is a lot of configuration possible via SelectType, SelectTypeParameters and the scheduler. Read up on them in the slurm.conf man page or on the SchedMD website.
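For what it's worth, here is a minimal sketch of the kind of slurm.conf lines I mean. It is not taken from your linked config, and the exact parameters depend on your hardware and policies, so treat it as a starting point rather than a drop-in fix:

    # use the consumable-resources select plugin so cores are
    # allocated per job instead of whole nodes
    SelectType=select/cons_res
    # allocate individual cores; CR_LLN additionally places jobs
    # on the least-loaded eligible node
    SelectTypeParameters=CR_Core,CR_LLN

If you also want memory treated as a consumable resource you could use CR_Core_Memory instead of CR_Core, but then the nodes need sane RealMemory values and jobs need memory limits. If I remember right, changing SelectType requires restarting slurmctld (and the slurmd daemons) rather than just an "scontrol reconfigure".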