slurm.conf contains the following:
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
AccountingStorageTRES=gres/gpu
Could this be constraining CfgTRES=cpu=16 somehow?
David Guertin
From: Guertin, David S.
Sent: Wednesday, April 6, 2022 12:27 PM
To: Slurm
No, the user is submitting four jobs, each requesting 1/4 of the memory and 1/4
of the CPUs (i.e. 8 out of 32). But even though there are 32 physical cores,
Slurm only shows 16 as trackable resources:
From scontrol show node node020:
CfgTRES=cpu=16,mem=257600M,billing=16,gres/gpu=4
Why would
Thanks. That shows 32 cores, as expected:
# /cm/shared/apps/slurm/19.05.8/sbin/slurmd -C
NodeName=node020 CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16
ThreadsPerCore=1 RealMemory=257600
UpTime=0-22:39:36
But I can't understand why, when users submit jobs, the node allocates only 16
of them.
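
For reference, the mismatch described here can be checked mechanically. The following is a minimal sketch, not anything from Slurm itself: it multiplies the topology fields reported by slurmd -C and compares the result with the cpu count in the CfgTRES string from scontrol show node. The helper names (parse_kv, expected_cpus, parse_tres) are made up for illustration, and treating an omitted topology field as 1 is an assumption of the sketch, so check it against the slurm.conf documentation for your version.

```python
def parse_kv(line):
    """Parse whitespace-separated Key=Value tokens into a dict."""
    return dict(tok.split("=", 1) for tok in line.split() if "=" in tok)


def expected_cpus(fields):
    """Multiply the topology fields; an omitted field counts as 1 (assumption)."""
    total = 1
    for key in ("Boards", "SocketsPerBoard", "CoresPerSocket", "ThreadsPerCore"):
        total *= int(fields.get(key, 1))
    return total


def parse_tres(tres):
    """Split a TRES string such as 'cpu=16,mem=257600M' into a dict."""
    return dict(item.split("=", 1) for item in tres.split(","))


# Values copied from the slurmd -C and scontrol output quoted in this thread.
slurmd_c = (
    "NodeName=node020 CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 "
    "ThreadsPerCore=1 RealMemory=257600"
)
cfg_tres = "cpu=16,mem=257600M,billing=16,gres/gpu=4"

hardware = expected_cpus(parse_kv(slurmd_c))
configured = int(parse_tres(cfg_tres)["cpu"])

if hardware != configured:
    print(f"hardware reports {hardware} CPUs, Slurm is configured for {configured}")
```

Under that assumption, a node definition that leaves out one topology field would yield a smaller product than the hardware reports, so comparing the two numbers is a quick first check before looking at SelectTypeParameters.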
We've added a new GPU node to our cluster with 32 cores. It contains 2 16-core
sockets, and hyperthreading is turned off, so the total is 32 cores. But jobs
are only being allowed to use 16 cores.
Here's the relevant line from slurm.conf:
NodeName=node020 CoresPerSocket=16 RealMemory=257600 Thr
again.
Best regards,
Taras
On Fri, Aug 9, 2019 at 11:54 PM Guertin, David S. <guer...@middlebury.edu> wrote:
What's even stranger is that I can change CoreSpecCount to any other number (2,
3, whatever), restart the daemons, and the change is made. But if I try to set
it to 0
no idea what that could be.
Dave
David Guertin
Information Technology Services
Middlebury College
700 Exchange St.
Middlebury, VT 05753
(802)443-3143
From: slurm-users on behalf of Guertin, David S.
Sent: Friday, August 9, 2019 4:28 PM
To: Slurm User Community
> Have you restarted all your slurm daemons?
Yes, I have done that on every node, but it still shows one specialized core.
> Not sure whether "scontrol reconfigure" can deal with that change.
I tried "scontrol reconfigure", but it also had no effect.
Thanks,
Dave
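
The check described above (restart or reconfigure, then see whether the node still reports a specialized core) could be scripted along these lines. This is only a sketch: core_spec_count is a hypothetical helper, the sample text is invented rather than captured from this cluster, and treating a missing CoreSpecCount key as 0 is an assumption about how the node record is printed.

```python
def core_spec_count(scontrol_text):
    """Return CoreSpecCount from Key=Value text, or 0 if the key is absent."""
    for token in scontrol_text.split():
        key, sep, value = token.partition("=")
        if sep and key == "CoreSpecCount":
            return int(value)
    return 0  # assumption: no key means no specialized cores


# Hypothetical sample text, not output captured from the cluster in this thread.
sample = "NodeName=node001 CoreSpecCount=1 State=IDLE"

if core_spec_count(sample) != 0:
    print("node still reports specialized cores")
```

In practice the text would come from running scontrol show node on each node after the daemons have been restarted.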
I was doing some testing with core specialization
(https://slurm.schedmd.com/core_spec.html) before deciding that I don't want to
have it enabled, so I'm trying to turn it off. The problem is that on most of
the nodes in my cluster, I can't disable it.
I've removed the CoreSpecCount and CPUSpec
Hello all,
I'm trying to turn off core specialization in my cluster by setting
CoreSpecCount=0, but checking with scontrol does not show my changes. If I set
CoreSpecCount=1 or CoreSpecCount=2, or anything except 0, the changes are applied
correctly. But when I set it to 0, no change is applied --