Here it goes:

##################BEGIN SLURM.CONF#######################
ClusterName=foner
ControlMachine=foner1,foner2
ControlAddr=slurm-server
#BackupController=
#BackupAddr=
#
SlurmUser=slurm
#SlurmdUser=root
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
CryptoType=crypto/munge
JobCredentialPrivateKey=/etc/slurm/private.key
JobCredentialPublicCertificate=/etc/slurm/public.key
StateSaveLocation=/SLURM
SlurmdSpoolDir=/var/log/slurm/spool_slurmd/
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurm/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
#ProctrackType=proctrack/pgid
ProctrackType=proctrack/linuxproc
TaskPlugin=task/affinity
TaskPluginParam=Cpusets
#PluginDir=
CacheGroups=0
#FirstJobId=
ReturnToService=0
#MaxJobCount=
#PlugStackConfig=
#PropagatePrioProcess=
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#Prolog=/data/scripts/prolog_ctld.sh
#Prolog=
Epilog=/data/scripts/epilog.sh
#SrunProlog=
#SrunEpilog=
#TaskProlog=
#TaskEpilog=
#TaskPlugin=
#TrackWCKey=no
#TreeWidth=50
#TmpFS=
#UsePAM=
#UsePAM=1
#
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
#SchedulerAuth=
#SchedulerPort=
#SchedulerRootFilter=
#SelectType=select/linear
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK
FastSchedule=1
PriorityType=priority/multifactor
#PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=14-0
PriorityWeightFairshare=0
PriorityWeightAge=0
PriorityWeightPartition=0
PriorityWeightJobSize=0
PriorityWeightQOS=1000
#PriorityMaxAge=1-0
#
# LOGGING
SlurmctldDebug=5
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=5
SlurmdLogFile=/var/log/slurm/slurmd.log
JobCompType=jobcomp/none
#JobCompLoc=
#
# ACCOUNTING
#JobAcctGatherType=jobacct_gather/linux
#JobAcctGatherFrequency=30
#
#AccountingStorageType=accounting_storage/slurmdbd
##AccountingStorageHost=slurm-server
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStorageUser=
#
AccountingStorageEnforce=qos
AccountingStorageLoc=slurm_acct_db
AccountingStorageType=accounting_storage/slurmdbd
AccountingStoragePort=8544
AccountingStorageUser=root
#AccountingStoragePass=slurm
AccountingStorageHost=slurm-server
# ACCT_GATHER
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=60
#AcctGatherEnergyType=acct_gather_energy/rapl
#AcctGatherNodeFreq=30
# Memory
#DefMemPerCPU=1024 # 1GB
#MaxMemPerCPU=3072 # 3GB
# COMPUTE NODES
NodeName=foner[11-14] Procs=20 RealMemory=258126 Sockets=2 CoresPerSocket=10 ThreadsPerCore=1 State=UNKNOWN
NodeName=foner[101-142] CPUs=20 Sockets=2 CoresPerSocket=10 ThreadsPerCore=1 RealMemory=64398 State=UNKNOWN
PartitionName=thin Nodes=foner[103-142] Shared=NO PreemptMode=CANCEL State=UP MaxTime=4320 MinNodes=2
PartitionName=thin_test Nodes=foner[101,102] Default=YES Shared=NO PreemptMode=CANCEL State=UP MaxTime=60 MaxNodes=1
PartitionName=fat Nodes=foner[11-14] Shared=NO PreemptMode=CANCEL State=UP MaxTime=4320 MaxNodes=1
##################END SLURM.CONF#######################
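Since the distribution behaviour discussed below hinges on the select and task plugin settings above, a quick way to double-check what the controller actually loaded is to query the running daemon. This is only a sketch, assuming scontrol is available on a login node of the foner cluster:

# Select plugin and task affinity settings as parsed by slurmctld
scontrol show config | grep -E -i 'SelectType|TaskPlugin|FastSchedule'
# Geometry of one thin node and the limits of the thin partition
scontrol show node foner101
scontrol show partition thin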
On 07/04/14 17:40, Mehdi Denou wrote:

Could you provide us the slurm.conf?

On 04/04/2014 14:46, Joan Arbona wrote:

Doesn't work either. I also tried with -m block:block with no luck...

On 04/04/14 14:13, Mehdi Denou wrote:

Of course, -N 1 is wrong, since you request more CPUs than are available on one node. I didn't read your mail to the end, sorry.

Try with: -n 25 -m plane=20

On 04/04/2014 13:57, Joan Arbona wrote:

Not working; it just says that I cannot use more processors than permitted:

srun: error: Unable to create job step: More processors requested than permitted

Thanks

On 04/04/14 13:50, Mehdi Denou wrote:

Try with: srun -N 1 -n 25

On 04/04/2014 13:47, Joan Arbona wrote:

Excuse me, I confused "Nodes" with "Tasks". When I wrote "Nodes" in the last e-mail I meant "tasks". Let me explain it again with an example:

My cluster has 2 nodes with 20 processors per node. I want to allocate all 40 processors and both nodes in sbatch. Then I have to execute a job step with srun on a subset of 25 processors. I want SLURM to fill nodes completely, that is, to use all 20 processors of the first node and 5 of the second one. If I execute an sbatch script like this:

#!/bin/bash
[...]
#SBATCH --nodes=2
#SBATCH --ntasks=40
srun -n25 hostname

it does not work: it executes 12 hostname tasks on the first node and 13 on the second, when it should execute 20 on the first and 5 on the second.

Thanks and sorry for the confusion,
Joan

On 04/04/14 13:22, Mehdi Denou wrote:

It's a little bit confusing:

"When in sbatch I specify that I want to allocate 25 nodes and I execute"

So that means -N 25. For example, if you want to allocate 40 nodes and then execute srun on 25 of them:

#!/bin/bash
#SBATCH -N 40
srun -N 25 hostname

-n is the number of tasks (the number of system processes); -N or --nodes is the number of nodes. If you don't specify -n, it is set to 1 by default.

On 04/04/2014 11:24, Joan Arbona wrote:

Thanks for the answer. No luck anyway.

When in sbatch I specify that I want to allocate 25 nodes and I execute srun without parameters, it works. However, if I specify that I want to allocate 40 nodes and then execute srun selecting only 25 of them, it does not work. That is:

---
1.

#!/bin/bash
[...]
#SBATCH --nodes=2
#SBATCH --ntasks=25
srun hostname

-> Works, but we don't want it because we need srun to select a subset of the requested nodes.

---
2.

#!/bin/bash
[...]
#SBATCH --nodes=2
#SBATCH --ntasks=40
srun -n25 hostname

-> Doesn't work. It executes half of the processes on the first node and the other half on the second. Also tried removing --nodes=2.

---

It seems that it is the way sbatch influences srun. Is there any way to see which parameters the sbatch call transfers to srun?

Thanks,
Joan
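On that last question: sbatch exports the allocation to the batch script as SLURM_* environment variables (SLURM_NTASKS, SLURM_JOB_NODELIST, SLURM_TASKS_PER_NODE, and so on), and srun takes most of its defaults from them. A minimal sketch for inspecting them, assuming the same two-node allocation as in example 2 above:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=40
# Dump the variables sbatch sets for this allocation; srun derives its
# default task count, node list and distribution from these.
env | grep '^SLURM_' | sort
srun -n25 hostname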
On 04/04/14 10:54, Mehdi Denou wrote:

Hello,

You should take a look at the parameter --mincpu.

On 04/04/2014 10:22, Joan Arbona wrote:

Hello all,

We have a cluster with 40 nodes and 20 cores per node, and we are trying to distribute job steps executed with sbatch "in blocks". That means we want to fill the maximum number of nodes and, if the number of tasks is not a multiple of 20, to have only one node with some of its cores idle. For example, if we executed a task on 25 cores, we would have node 1 with all 20 cores reserved and node 2 with only 5 cores reserved.

If we execute srun -n25 -pthin hostname, it works fine and produces the following output:

foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner118
foner119
foner119
foner119
foner119
foner119

However, when we execute this from an sbatch script it does not work at all. I have tried it with every configuration I know of and with every parameter that seemed useful. Instead, it executes 13 processes on the first node and 12 processes on the second node. This is our sbatch script:

#!/bin/bash
#SBATCH --job-name=prova_joan
#SBATCH --partition=thin
#SBATCH --output=WRFJobName-%j.out
#SBATCH --error=WRFJobName-%j.err
#SBATCH --nodes=2
#SBATCH --ntasks=40

srun -n25 --exclusive hostname &
wait

I have already tried removing the --exclusive and the & without success.

To sum up, the question is: what is the way to group the tasks of a job step so that they fill as many nodes as possible with sbatch?
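For the record, a sketch of the plane-distribution variant Mehdi suggests further up the thread (-n 25 -m plane=20), untested here: with a plane size equal to the cores per node, the intent is that the step is laid out as 20 tasks on the first node and 5 on the second.

#!/bin/bash
#SBATCH --partition=thin
#SBATCH --nodes=2
#SBATCH --ntasks=40
# Plane distribution with a block of 20 tasks per node; the goal is a
# 20 + 5 layout for a 25-task step instead of 13 + 12.
srun -n 25 -m plane=20 hostname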
Thanks,
Joan

PS: Attaching slurm.conf.

--
Joan Francesc Arbona
Ext. 2582
Centre de Tecnologies de la Informació
Universitat de les Illes Balears
http://jfdeu.wordpress.com
http://guifisoller.wordpress.com