I think it's because hostname is so undemanding. How many CPUs does each host have?
You may need to run ((number of CPUs per host) + 1) tasks to see activity on another node. You could also try stress-ng to generate a higher load; a couple of example commands are sketched below the quoted message.
https://www.cyberciti.biz/faq/stress-test-linux-unix-server-with-stress-ng/

cheers
L.

------

"The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic civics is the insistence that we cannot ignore the truth, nor should we panic about it. It is a shared consciousness that our institutions have failed and our ecosystem is collapsing, yet we are still here — and we are creative agents who can shape our destinies. Apocalyptic civics is the conviction that the only way out is through, and the only way through is together."
*Greg Bloom* @greggish
https://twitter.com/greggish/status/873177525903609857

On 28 July 2017 at 10:28, 허웅 <[email protected]> wrote:
> I have 5 nodes, including the control node.
>
> My nodes look like this:
>
> Control Node : GO1
> Compute Nodes : GO[1-5]
>
> When I try to allocate a job to multiple nodes, only one node works.
>
> Example:
>
> $ srun -N5 hostname
> GO1
> GO1
> GO1
> GO1
> GO1
>
> even though I expected this:
>
> $ srun -N5 hostname
> GO1
> GO2
> GO3
> GO4
> GO5
>
> What should I do?
>
> Here are some of my configs:
>
> $ scontrol show frontend
> FrontendName=GO1 State=IDLE Version=17.02 Reason=(null)
> BootTime=2017-06-02T20:14:39 SlurmdStartTime=2017-07-27T16:29:46
>
> FrontendName=GO2 State=IDLE Version=17.02 Reason=(null)
> BootTime=2017-07-05T17:54:13 SlurmdStartTime=2017-07-27T16:30:07
>
> FrontendName=GO3 State=IDLE Version=17.02 Reason=(null)
> BootTime=2017-07-05T17:22:58 SlurmdStartTime=2017-07-27T16:30:08
>
> FrontendName=GO4 State=IDLE Version=17.02 Reason=(null)
> BootTime=2017-07-05T17:21:40 SlurmdStartTime=2017-07-27T16:30:08
>
> FrontendName=GO5 State=IDLE Version=17.02 Reason=(null)
> BootTime=2017-07-05T17:21:39 SlurmdStartTime=2017-07-27T16:30:09
>
> $ scontrol ping
> Slurmctld(primary/backup) at GO1/(NULL) are UP/DOWN
>
> [slurm.conf]
> # slurm.conf
> #
> # See the slurm.conf man page for more information.
> #
> ClusterName=linux
> ControlMachine=GO1
> ControlAddr=192.168.30.74
> #
> SlurmUser=slurm
> SlurmctldPort=6817
> SlurmdPort=6818
> AuthType=auth/munge
> StateSaveLocation=/var/lib/slurmd
> SlurmdSpoolDir=/var/spool/slurmd
> SwitchType=switch/none
> MpiDefault=none
> SlurmctldPidFile=/var/run/slurmd/slurmctld.pid
> SlurmdPidFile=/var/run/slurmd/slurmd.pid
> ProctrackType=proctrack/pgid
> ReturnToService=0
> TreeWidth=50
> #
> # TIMERS
> SlurmctldTimeout=300
> SlurmdTimeout=300
> InactiveLimit=0
> MinJobAge=300
> KillWait=30
> Waittime=0
> #
> # SCHEDULING
> SchedulerType=sched/backfill
> FastSchedule=1
> #
> # LOGGING
> SlurmctldDebug=7
> SlurmctldLogFile=/var/log/slurmctld.log
> SlurmdDebug=7
> SlurmdLogFile=/var/log/slurmd.log
> JobCompType=jobcomp/none
> #
> # COMPUTE NODES
> NodeName=sgo[1-5] NodeHostName=GO[1-5] #NodeAddr=192.168.30.[74,141,68,70,72]
>
> #
> # PARTITIONS
> PartitionName=party Default=yes Nodes=ALL
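P.S. A rough sketch of what I mean, assuming (purely for illustration) that each host has 4 CPUs and that stress-ng is installed on the compute nodes; adjust the numbers to your hardware:

# 5 single-CPU tasks cannot all fit on one 4-CPU node, so Slurm has to
# place at least one task on a second node.
$ srun -n5 hostname

# Generate real CPU load for 60 seconds, one stress-ng instance per node.
$ srun -N5 --ntasks-per-node=1 stress-ng --cpu 4 --timeout 60s

How many tasks -n actually needs depends on how many CPUs Slurm believes each node has (check "scontrol show node"), so use one more than that number.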
