Thank you for the support. I will be back with any additional questions. BTW, in case it changes or adds to your thoughts, I'm working in AWS on a ParallelCluster cluster.

Hoot

On 1/21/22 4:12 AM, Ole Holm Nielsen wrote:
On 1/21/22 10:05, Diego Zuccato wrote:
On 21/01/2022 07:51, Ole Holm Nielsen wrote:

There's a nice command to run on any given node which tells you slurmd's view of the node:
$ slurmd -C
NodeName=i004 CPUs=16 Boards=1 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=128691
UpTime=36-17:32:44
Here's an example that I use:
NodeName=i[004-030] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=128000
@Hoot: note that the line in slurm.conf uses a lower value for RealMemory than slurmd -C reports (128000 instead of 128691). This way you'll avoid having nodes go offline after an upgrade because the new kernel leaves a couple of MB less free memory than the old one (been there, done that... :( ).
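
For illustration only, here is a minimal Python sketch of that idea: take the first line of `slurmd -C` output and turn it into a slurm.conf node line with RealMemory rounded down. The function name `node_line_with_headroom` and the rule of rounding down to a multiple of 1000 MB are assumptions made for this example, not something Slurm or the posters prescribe. Computing Sockets as Boards times SocketsPerBoard is likewise an assumption about how the two notations correspond, so pick whatever safety margin and layout suit your nodes.

```python
import re


def node_line_with_headroom(slurmd_c_output, round_to_mb=1000):
    """Build a slurm.conf NodeName line from `slurmd -C` output.

    RealMemory is rounded down to a multiple of round_to_mb (a hypothetical
    safety rule) so a small drop in reported memory does not exceed the
    configured value.
    """
    # The first line holds the key=value pairs; the UpTime line is ignored.
    first_line = slurmd_c_output.strip().splitlines()[0]
    fields = dict(re.findall(r"(\w+)=(\S+)", first_line))

    # Assumption: total sockets = boards * sockets per board.
    sockets = int(fields["Boards"]) * int(fields["SocketsPerBoard"])
    real_memory = (int(fields["RealMemory"]) // round_to_mb) * round_to_mb

    return (
        f"NodeName={fields['NodeName']} Sockets={sockets} "
        f"CoresPerSocket={fields['CoresPerSocket']} "
        f"ThreadsPerCore={fields['ThreadsPerCore']} "
        f"RealMemory={real_memory}"
    )


# Sample input copied from the slurmd -C output quoted above.
sample = """NodeName=i004 CPUs=16 Boards=1 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=128691
UpTime=36-17:32:44"""

print(node_line_with_headroom(sample))
```

With the sample above, RealMemory=128691 becomes 128000, the same value used in the slurm.conf line quoted earlier. For a range of identical nodes you would still edit NodeName by hand, e.g. to i[004-030].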

@Diego: Thanks for reminding me of this!  I've added this information to my Wiki page https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#compute-node-configuration

/Ole

