Do you have that resource handy? I looked through the cgroups documentation but found very little in the way of tutorials on modifying the device permissions.
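From what I could piece together from the kernel docs, doing it by hand would look roughly like the sketch below. This assumes cgroup v1 with the devices controller mounted at /sys/fs/cgroup/devices, and that the NVIDIA character devices use major number 195 (worth confirming with ls -l /dev/nvidia*); the 'login' cgroup name is just an example:

  # create a cgroup for interactive login sessions (name is arbitrary)
  sudo mkdir /sys/fs/cgroup/devices/login

  # deny read/write/mknod on all character devices with major 195 (the NVIDIA nodes)
  echo 'c 195:* rwm' | sudo tee /sys/fs/cgroup/devices/login/devices.deny

  # move the current shell into that cgroup
  echo $$ | sudo tee /sys/fs/cgroup/devices/login/cgroup.procs

  # nvidia-smi run from this shell should now fail to open the GPUs
  nvidia-smi

Note that /dev/nvidia-uvm may use a different, dynamically assigned major and would need its own deny rule, and something like pam_exec or a systemd slice would be needed to place every SSH session into such a cgroup automatically rather than doing it by hand.
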
On Mon, May 20, 2019 at 2:45 AM John Hearns <hear...@googlemail.com> wrote:

> Two replies here.
> First off, for normal user logins you can direct them into a cgroup - I
> looked into this about a year ago and it was actually quite easy.
> As I remember there is a service or utility available which does just
> that. Of course the user cgroup would not have
>
> Expanding on my theme, it is probably a good idea then to have all the
> system processes contained in a 'boot cpuset' - that is, at system boot time
> allocate a small number of cores to the system daemons, Slurm processes
> and probably the user login sessions,
> thus freeing up the other CPUs for batch jobs exclusively.
>
> Also, you could try simply setting CUDA_VISIBLE_DEVICES to null in one of
> the system-wide login scripts.
>
> On Mon, 20 May 2019 at 08:38, Nathan Harper <nathan.har...@cfms.org.uk> wrote:
>
>> This doesn't directly answer your question, but in Feb last year on the
>> ML there was a discussion about limiting user resources on login nodes
>> (Stopping compute usage on login nodes). Some of the suggestions
>> included the use of cgroups to do so, and it's possible that those methods
>> could be extended to limit access to GPUs, so it might be worth looking
>> into.
>>
>> On Sat, 18 May 2019 at 00:28, Dave Evans <rdev...@ece.ubc.ca> wrote:
>>
>>> We are using a single-system "cluster" and want some control of fair use
>>> of the GPUs. The users are not supposed to be able to use the GPUs until
>>> they have allocated the resources through Slurm. We have no head node, so
>>> slurmctld, slurmdbd, and slurmd are all run on the same system.
>>>
>>> I have a configuration working now such that the GPUs can be scheduled
>>> and allocated.
>>> However, logging into the system before allocating GPUs gives full access
>>> to all of them.
>>>
>>> I would like to configure Slurm cgroups to disable access to GPUs until
>>> they have been allocated.
>>>
>>> On first login, I get:
>>>
>>> nvidia-smi -q | grep UUID
>>>     GPU UUID : GPU-6076ce0a-bc03-a53c-6616-0fc727801c27
>>>     GPU UUID : GPU-5620ec48-7d76-0398-9cc1-f1fa661274f3
>>>     GPU UUID : GPU-176d0514-0cf0-df71-e298-72d15f6dcd7f
>>>     GPU UUID : GPU-af03c80f-6834-cb8c-3133-2f645975f330
>>>     GPU UUID : GPU-ef10d039-a432-1ac1-84cf-3bb79561c0d3
>>>     GPU UUID : GPU-38168510-c356-33c9-7189-4e74b5a1d333
>>>     GPU UUID : GPU-3428f78d-ae91-9a74-bcd6-8e301c108156
>>>     GPU UUID : GPU-c0a831c0-78d6-44ec-30dd-9ef5874059a5
>>>
>>> And running from the queue:
>>>
>>> srun -N 1 --gres=gpu:2 nvidia-smi -q | grep UUID
>>>     GPU UUID : GPU-6076ce0a-bc03-a53c-6616-0fc727801c27
>>>     GPU UUID : GPU-5620ec48-7d76-0398-9cc1-f1fa661274f3
>>>
>>> Pastes of my config files are:
>>>
>>> ## slurm.conf ##
>>> https://pastebin.com/UxP67cA8
>>>
>>> ## cgroup.conf ##
>>> CgroupAutomount=yes
>>> CgroupReleaseAgentDir="/etc/slurm/cgroup"
>>>
>>> ConstrainCores=yes
>>> ConstrainDevices=yes
>>> ConstrainRAMSpace=yes
>>> #TaskAffinity=yes
>>>
>>> ## cgroup_allowed_devices_file.conf ##
>>> /dev/null
>>> /dev/urandom
>>> /dev/zero
>>> /dev/sda*
>>> /dev/cpu/*/*
>>> /dev/pts/*
>>> /dev/nvidia*
>>
>> --
>> Nathan Harper // IT Systems Lead
>> e: nathan.har...@cfms.org.uk  t: 0117 906 1104  m: 0787 551 0891
>> w: www.cfms.org.uk
>> CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent //
>> Emersons Green // Bristol // BS16 7FR
>>
>> CFMS Services Ltd is registered in England and Wales No 05742022 - a
>> subsidiary of CFMS Ltd
>> CFMS Services Ltd registered office // 43 Queens Square // Bristol //
>> BS1 4QP
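
On John's CUDA_VISIBLE_DEVICES idea: as a stop-gap that could be something like the snippet below in a system-wide login script (the path /etc/profile.d/hide_gpus.sh is just an example). Slurm exports its own CUDA_VISIBLE_DEVICES inside jobs that are allocated GPUs through gres, so batch jobs should be unaffected, but it is only advisory - a user can unset it, and nvidia-smi talks to the driver directly so it will still list the cards:

  # /etc/profile.d/hide_gpus.sh (example path)
  # Hide the GPUs from CUDA applications in interactive shells.
  # Slurm sets CUDA_VISIBLE_DEVICES itself for steps with a gres/gpu
  # allocation, so jobs launched through srun/sbatch are unaffected.
  if [ -z "$SLURM_JOB_ID" ]; then
      export CUDA_VISIBLE_DEVICES=""
  fi

That keeps well-behaved CUDA programs off the GPUs, but the ConstrainDevices route is still what I would like to get working for real enforcement.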