Thank you all for the help! The plugin seems to be the thing I'm looking for. I'll try to test it with a spare server/GPUs.
Thanks again!
-- 
Kamil Wilczek

On 04.04.2022 at 09:20, Bas van der Vlies wrote:
We have the exact same request for our GPUs that are not A100, and we have developed a Lua plugin for our needs (the new Slurm version, 22.XX, will also allow this). But for earlier versions:

 * https://github.com/basvandervlies/surf_slurm_mps

On 03/04/2022 23:19, Kamil Wilczek wrote:

Hello!

I am an administrator of a GPU cluster (Slurm version 19.05.5). Could someone help me a little bit and explain whether a single GPU can be shared between multiple users? My experience and the documentation tell me that it is not possible. But even after some time Slurm is still a beast to me and I find myself struggling :)

 * I set up the cluster to assign GPUs on multi-GPU servers to different users using GRES. This works fine, and several users can work on a multi-GPU machine (--gres=gpu:N/--gpus=N).

 * But sometimes I get requests to allow a group of students to work simultaneously and interactively on a small partition, where there are more users than GPUs. So I thought that maybe MPS is a solution, but the docs say that MPS is a way to run multiple jobs of *the same* user on a single GPU. When another user requests a GPU through MPS, the job is enqueued and waits for the first user's MPS server to finish. So this is not a solution for a multi-user, simultaneous/parallel environment, right?

Is there a way to share a GPU between multiple users? The requirement is, say:

 * 16 users working interactively, simultaneously
 * a 4-GPU partition

Kind Regards
-- 
Kamil Wilczek [https://keys.openpgp.org/]
[D415917E84B8DA5A60E853B6E676ED061316B69B]
Laboratorium Komputerowe
Wydział Matematyki, Informatyki i Mechaniki
Uniwersytet Warszawski
ul. Banacha 2
02-097 Warszawa
Tel.: 22 55 44 392
https://www.mimuw.edu.pl/
https://www.uw.edu.pl/