That output of slurmd -C is your answer.

slurmd only detects about 6 GB of memory on the node (RealMemory=6097), while your slurm.conf claims about 10 GB (RealMemory=10193).

I would run some memory tests, look at /proc/meminfo on the node, etc.

Maybe even check that the type/size of memory in there is what you think it is.
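
To put a number on the mismatch, here is a minimal Python sketch that pulls RealMemory out of the two node lines quoted below and compares them. The helper name real_memory_mb is made up for illustration and is not part of Slurm; the two strings are the "slurmd -C" output and the slurm.conf entry from this thread, joined onto one line each.

```python
import re


def real_memory_mb(node_line: str) -> int:
    """Return the RealMemory value (in MB) from a Slurm node definition line."""
    match = re.search(r"\bRealMemory=(\d+)", node_line)
    if match is None:
        raise ValueError("no RealMemory field found in: " + node_line)
    return int(match.group(1))


# What slurmd -C reported on the node (from this thread).
detected = (
    "NodeName=node012 CPUs=4 Boards=1 SocketsPerBoard=2 CoresPerSocket=2 "
    "ThreadsPerCore=1 RealMemory=6097"
)

# What slurm.conf currently says for the same node (from this thread).
configured = (
    "NodeName=node012 CPUs=4 Boards=1 SocketsPerBoard=2 CoresPerSocket=2 "
    "ThreadsPerCore=1 RealMemory=10193  State=UNKNOWN"
)

detected_mb = real_memory_mb(detected)
configured_mb = real_memory_mb(configured)
shortfall_mb = configured_mb - detected_mb

print(f"detected:   {detected_mb} MB")
print(f"configured: {configured_mb} MB")
print(f"shortfall:  {shortfall_mb} MB")
```

The shortfall works out to 4096 MB, which might be consistent with a single 4 GB module no longer being detected, though that is only a guess until the hardware is checked.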

Brian Andrus

On 5/25/2023 7:30 AM, Roger Mason wrote:
Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> writes:

1. Is slurmd running on the node?
Yes.

2. What's the output of "slurmd -C" on the node?
NodeName=node012 CPUs=4 Boards=1 SocketsPerBoard=2 CoresPerSocket=2
ThreadsPerCore=1 RealMemory=6097

3. Define State=UP in slurm.conf instead of UNKNOWN
Will do.

4. Why have you configured TmpDisk=0?  It should be the size of the
/tmp filesystem.
I have not configured TmpDisk.  This is the entry in slurm.conf for that
node:
NodeName=node012 CPUs=4 Boards=1 SocketsPerBoard=2 CoresPerSocket=2
ThreadsPerCore=1 RealMemory=10193  State=UNKNOWN

But I do notice that slurmd -C now says there is less memory than
configured.

Thanks again.

Roger

