Someone else might see more than I do, but from what you’ve posted it’s clear that compute-0-0 will be used only after the other, lower-weighted nodes are too full to accept a particular job: Slurm allocates equally-suitable nodes in ascending order of Weight, and compute-0-0 carries Weight=20511900 while compute-0-1 carries 20511899, so compute-0-0 sorts last.
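If you want to double-check that ordering, sinfo can print each node's scheduling weight via its %w format field:

# sinfo -N -o "%N %t %w"

And if you actually want compute-0-0 to fill up alongside its siblings, one option (a sketch only; the edited line just copies compute-0-1's weight) is to change its node.conf entry to

NodeName=compute-0-0 NodeAddr=10.1.1.254 CPUs=32 Weight=20511899 Feature=rack-0,32CPUs

and then have slurmctld re-read the configuration:

# scontrol reconfigure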
I assume you’ve already submitted a set of jobs requesting enough resources to fill all the nodes, and that some jobs stay in a pending state instead of using compute-0-0, which sits idle?

> On Apr 19, 2020, at 1:10 PM, Mahmood Naderan <mahmood...@gmail.com> wrote:
>
> Hi,
> Although compute-0-0 is included in a partition, I have noticed that
> no job is offloaded there automatically. If someone intentionally
> writes --nodelist=compute-0-0, it works fine.
>
> # grep -r compute-0-0 .
> ./nodenames.conf.new:NodeName=compute-0-0 NodeAddr=10.1.1.254 CPUs=32 Weight=20511900 Feature=rack-0,32CPUs
> ./node.conf:NodeName=compute-0-0 NodeAddr=10.1.1.254 CPUs=32 Weight=20511900 Feature=rack-0,32CPUs
> ./nodenames.conf.new4:NodeName=compute-0-0 NodeAddr=10.1.1.254 CPUs=32 Weight=20511900 Feature=rack-0,32CPUs
> # grep -r compute-0-1 .
> ./nodenames.conf.new:NodeName=compute-0-1 NodeAddr=10.1.1.253 CPUs=32 Weight=20511899 Feature=rack-0,32CPUs
> ./node.conf:NodeName=compute-0-1 NodeAddr=10.1.1.253 CPUs=32 Weight=20511899 Feature=rack-0,32CPUs
> ./nodenames.conf.new4:NodeName=compute-0-1 NodeAddr=10.1.1.253 CPUs=32 Weight=20511899 Feature=rack-0,32CPUs
> # cat parts
> PartitionName=WHEEL RootOnly=yes Priority=1000 Nodes=ALL
> PartitionName=SEA AllowAccounts=fish Nodes=ALL
> # scontrol show node compute-0-0
> NodeName=compute-0-0 Arch=x86_64 CoresPerSocket=1
>    CPUAlloc=0 CPUTot=32 CPULoad=0.01
>    AvailableFeatures=rack-0,32CPUs
>    ActiveFeatures=rack-0,32CPUs
>    Gres=(null)
>    NodeAddr=10.1.1.254 NodeHostName=compute-0-0
>    OS=Linux 3.10.0-1062.1.2.el7.x86_64 #1 SMP Mon Sep 30 14:19:46 UTC 2019
>    RealMemory=64259 AllocMem=0 FreeMem=63421 Sockets=32 Boards=1
>    State=IDLE ThreadsPerCore=1 TmpDisk=444124 Weight=20511900
>    Owner=N/A MCS_label=N/A
>    Partitions=CLUSTER,WHEEL,SEA
>    BootTime=2020-04-18T10:30:07 SlurmdStartTime=2020-04-19T22:32:12
>    CfgTRES=cpu=32,mem=64259M,billing=47
>    AllocTRES=
>    CapWatts=n/a
>    CurrentWatts=0 AveWatts=0
>    ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>
> # squeue
>  JOBID PARTITION     NAME USER ST       TIME NODES NODELIST(REASON)
>    436       SEA  relax13  raz  R   21:44:22     3 compute-0-[1-2],hpc
>    435       SEA 261660mo  abb  R 1-05:19:31     3 compute-0-[1-2],hpc
>
> compute-0-0 is idle. So why did Slurm decide to put those jobs on the other nodes?
> Any idea for debugging?
>
> Regards,
> Mahmood
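As for debugging ideas, two low-impact checks come to mind (a sketch; the partition name and CPU count are taken from your output, and it assumes the submitting account is allowed in SEA):

# srun -p SEA -n 32 --test-only hostname

--test-only asks the controller where and when such a job would be scheduled without actually running it; if the busy nodes have no free CPUs left, a 32-CPU request should report compute-0-0. If it doesn't, raising the slurmctld log level makes the node-selection decisions visible in slurmctld.log:

# scontrol setdebug debug2

Remember to drop it back with "scontrol setdebug info" afterwards.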