Hi, my Slurm cluster has a dozen machines configured as follows:
NodeName=foobar01 CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=2 RealMemory=257243 State=UNKNOWN

and the scheduling section is:

# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core

My problem is that only half of the logical cores are used when I run a computation. Let me explain: I use R and the 'batchtools' package to create jobs, and under the hood all the jobs are submitted with sbatch. If I log in to the machines in my cluster and run 'htop', I can see that only half of the logical cores are in use. Other ways of measuring the load on each machine confirmed this "visual" clue.

My jobs ask Slurm for only one CPU per task. I tried to enforce that with -c 1 (--cpus-per-task=1), but it didn't make any difference. Then I noticed something strange: when I run scontrol show job <jobid>, I see the following output:

NumNodes=1 NumCPUs=2 NumTasks=0 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=2,node=1,billing=2
Socks/Node=* NtasksPerN:B:S:C=0:0:*:2 CoreSpec=*

That is, each job uses NumCPUs=2 instead of 1. I am also not sure why TRES=cpu=2.

Any idea how to solve this problem and have 100% of the logical cores allocated?

Best regards,
David
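
P.S. In case it helps, this is roughly how the jobs are submitted from R. It is a simplified sketch: the registry directory, the template file name, and the resource names (ncpus, walltime, memory) are placeholders that depend on the site's slurm.tmpl, not necessarily what I use verbatim.

library(batchtools)

# Throw-away registry; the file.dir here is just a placeholder
reg <- makeRegistry(file.dir = "registry", seed = 1)

# Slurm cluster functions; "slurm.tmpl" stands in for whatever
# template the cluster actually uses
reg$cluster.functions <- makeClusterFunctionsSlurm(template = "slurm.tmpl")

# One toy job per input value
batchMap(fun = function(x) x^2, x = 1:100, reg = reg)

# Submit, asking for a single CPU per task; the resource names must
# match the placeholders referenced in the template
submitJobs(reg = reg,
           resources = list(ncpus = 1, walltime = 3600, memory = 2048))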