Hello,

I have Slurm 17.11 installed on a 64-core server. My 9 partitions are all set with OverSubscribe=NO, so I would expect that once all 64 cores are assigned to jobs, Slurm would simply leave new jobs in the PENDING state. Instead, it keeps starting new jobs, so that more than 64 cores end up assigned.
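I have not pasted my full slurm.conf, but the node and partition layout is roughly of the following form. This is only a sketch: the node line is reconstructed from the values slurmctld reports below (cpus:64 c:8 s:8 t:1 mem:968986), and everything on the PartitionName lines other than OverSubscribe=NO (node list, limits, defaults, the three partitions not shown) is illustrative:

    # sketch only - limits, priorities and defaults omitted
    NodeName=katak CPUs=64 Sockets=8 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=968986
    # all partitions use the same node and are declared with OverSubscribe=NO
    PartitionName=ibismini  Nodes=katak OverSubscribe=NO State=UP
    PartitionName=ibisinter Nodes=katak OverSubscribe=NO State=UP
    PartitionName=ibismax   Nodes=katak OverSubscribe=NO State=UP
    PartitionName=rclevesq  Nodes=katak OverSubscribe=NO State=UP
    PartitionName=ibis1     Nodes=katak OverSubscribe=NO State=UP
    PartitionName=ibis2     Nodes=katak OverSubscribe=NO State=UP
    ... (three more partitions of the same form)

"scontrol show partition" can be used to check the OverSubscribe value that slurmctld actually loaded for each partition.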
Looking at the slurmctld log, we can see that cores 21, 22 and 24 to 38 are currently assigned in more than one partition:

[2018-04-16T15:00:00.439] node:katak cpus:64 c:8 s:8 t:1 mem:968986 a_mem:231488 state:11
[2018-04-16T15:00:00.439] part:ibismini rows:1 prio:10
[2018-04-16T15:00:00.439] row0: num_jobs 6: bitmap: 4,6-12,16-33,48-55
[2018-04-16T15:00:00.439] part:ibisinter rows:1 prio:10
[2018-04-16T15:00:00.439] row0: num_jobs 1: bitmap: 24-41
[2018-04-16T15:00:00.439] part:ibismax rows:1 prio:10
[2018-04-16T15:00:00.439] row0: num_jobs 3: bitmap: 21-22,24-38,42-47,56-63
[2018-04-16T15:00:00.439] part:rclevesq rows:1 prio:10
[2018-04-16T15:00:00.439] part:ibis1 rows:1 prio:10
[2018-04-16T15:00:00.439] part:ibis2 rows:1 prio:10
[2018-04-16T15:00:00.439] row0: num_jobs 1: bitmap: 32-37

So some jobs are now sharing the same cores, but I don't understand why, since OverSubscribe is set to NO.

Thanks for your help!

---
Stéphane Larose
IT Analyst
Institut de Biologie Intégrative et des Systèmes (IBIS)
Pavillon Charles-Eugène-Marchand
Université Laval