Hi,

I am just building my first Slurm setup and have got everything running - well, almost.

I have a two-node configuration. All of my setup exists on a single Hyper-V server and I have divided up the resources to create my VMs.

One node I will use for heavy-duty work; this is called compute001.
One node I will use for normal work; this is called compute002.

My compute node specification in slurm.conf is:

NodeName=DEFAULT CPUs=1 RealMemory=1000 State=UNKNOWN
NodeName=compute001 CPUs=32
NodeName=compute002 CPUs=2

The partition specification is:

PartitionName=DEFAULT State=UP
PartitionName=interactive Nodes=compute002 MaxTime=INFINITE OverSubscribe=FORCE
PartitionName=simulation Nodes=compute001 MaxTime=30 OverSubscribe=FORCE

I have added the OverSubscribe=FORCE option as I want more than one job to be able to land on my interactive/simulation queues.

All of the nodes and the cluster master start up fine and they all talk to each other, but no matter what I do, I cannot get my cluster to accept more than one job per node.

Can you help me determine where I am going wrong?

Thanks a lot
Jake

The entire slurm.conf is pasted below:

# slurm.conf file generated by configurator.html.
ClusterName=pm-slurm
SlurmctldHost=slurm-master
MpiDefault=none
ProctrackType=proctrack/cgroup
ReturnToService=2
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/home/slurm/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/cgroup
#
# TIMERS
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
#
# LOGGING AND ACCOUNTING
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/cgroup
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurmd.log
# COMPUTE NODES
NodeName=DEFAULT CPUs=1 RealMemory=1000 State=UNKNOWN
NodeName=compute001 CPUs=32
NodeName=compute002 CPUs=2
PartitionName=DEFAULT State=UP
PartitionName=interactive Nodes=compute002 MaxTime=INFINITE OverSubscribe=FORCE
PartitionName=simulation Nodes=compute001 MaxTime=30 OverSubscribe=FORCE