Hi Andreas,

I'll try to sum this up ;)

First of all, I have now used a Broadwell node, so there is no interference from Skylake and sub-NUMA clustering.

We are using Slurm 18.08.5-2.

I have configured the node as slurmd -C reports it:
NodeName=lnm596          Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 RealMemory=120000 Feature=bwx2650,hostok,hpcwork                        Weight=10430 State=UNKNOWN

This is, what slurmctld knows about the node:
NodeName=lnm596 Arch=x86_64 CoresPerSocket=12
   CPUAlloc=0 CPUTot=48 CPULoad=0.03
   AvailableFeatures=bwx2650,hostok,hpcwork
   ActiveFeatures=bwx2650,hostok,hpcwork
   Gres=(null)
   GresDrain=N/A
   GresUsed=gpu:0
   NodeAddr=lnm596 NodeHostName=lnm596 Version=18.08
   OS=Linux 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019
   RealMemory=120000 AllocMem=0 FreeMem=125507 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=10430 Owner=N/A MCS_label=N/A
   Partitions=future
   BootTime=2019-02-19T07:43:33 SlurmdStartTime=2019-02-20T12:08:54
   CfgTRES=cpu=48,mem=120000M,billing=48
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=120 LowestJoules=714879 ConsumedJoules=8059263
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


Let's first begin with half of the node:

--ntasks=12 -> 12 CPUs requested. I implicitly get the hyperthreads for free (apart from the accounting).
   NumNodes=1 NumCPUs=24 NumTasks=12 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=24,mem=120000M,energy=46,node=1,billing=24
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=120000M MinTmpDiskNode=0

--ntasks=12 --cpus-per-task=2 -> 24 CPUs requested. I have now explicitly asked for 24 CPUs.
   NumNodes=1 NumCPUs=24 NumTasks=12 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
   TRES=cpu=24,mem=120000M,energy=55,node=1,billing=24
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
   MinCPUsNode=2 MinMemoryNode=120000M MinTmpDiskNode=0

--ntasks=12 --ntasks-per-node=12 --cpus-per-task=2 -> 24 CPUs requested. Additional constraint: all 12 tasks should be on one node. Here, too, I asked for 24 CPUs.
   NumNodes=1 NumCPUs=24 NumTasks=12 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
   TRES=cpu=24,mem=120000M,energy=55,node=1,billing=24
   Socks/Node=* NtasksPerN:B:S:C=12:0:*:1 CoreSpec=*
   MinCPUsNode=24 MinMemoryNode=120000M MinTmpDiskNode=0
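For reference, the last half-node request above corresponds to a batch script like the following (a minimal sketch; the wrapped srun command is just a placeholder, the partition name is taken from the scontrol output above):

```
#!/bin/bash
#SBATCH --ntasks=12           # 12 tasks
#SBATCH --ntasks-per-node=12  # all 12 tasks on one node
#SBATCH --cpus-per-task=2     # one physical core (2 hyperthreads) per task
#SBATCH --partition=future    # partition from the node's scontrol output

srun hostname                 # placeholder workload
```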

Everything is fine so far. Now I'll try to use the full node:

--ntasks=24 -> 24 CPUs requested, implicitly got 48.
   NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=48,mem=120000M,energy=62,node=1,billing=48
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=120000M MinTmpDiskNode=0

--ntasks=24 --cpus-per-task=2 -> 48 CPUs explicitly requested.
   NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
   TRES=cpu=48,mem=120000M,energy=62,node=1,billing=48
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
   MinCPUsNode=2 MinMemoryNode=120000M MinTmpDiskNode=0

And now the funny thing, which I don't understand:
--ntasks=24 --ntasks-per-node=24 --cpus-per-task=2 -> 48 CPUs requested, all 24 tasks on one node. Slurm tells me: sbatch: error: Batch job submission failed: Requested node configuration is not available

I would have expected the following job, which would have fit onto the node:
   NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
   TRES=cpu=48,mem=120000M,energy=62,node=1,billing=48
   Socks/Node=* NtasksPerN:B:S:C=24:0:*:1 CoreSpec=*
   MinCPUsNode=48 MinMemoryNode=120000M MinTmpDiskNode=0
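Just to make the arithmetic explicit (a trivial back-of-the-envelope check, not Slurm code), the rejected request asks for no more CPUs than the node actually has:

```shell
# Sanity check for the rejected request:
ntasks=24          # --ntasks / --ntasks-per-node
cpus_per_task=2    # --cpus-per-task
node_cpus=48       # CPUTot on lnm596 (2 sockets x 12 cores x 2 threads)

requested=$((ntasks * cpus_per_task))
echo "requested=$requested of $node_cpus"   # prints: requested=48 of 48
```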

Part of the sbatch -vvv output:
sbatch: ntasks            : 24 (set)
sbatch: cpus_per_task     : 2
sbatch: nodes             : 1 (set)
sbatch: sockets-per-node  : -2
sbatch: cores-per-socket  : -2
sbatch: threads-per-core  : -2
sbatch: ntasks-per-node   : 24
sbatch: ntasks-per-socket : -2
sbatch: ntasks-per-core   : -2

So, again, I see 24 tasks per node, 2 CPUs per task, and 1 node. Altogether that is 48 CPUs on one node, which fits perfectly, as one can see from the last two examples.


I am just asking explicitly for what Slurm already gives me implicitly. Or have I misunderstood something?

We will have to look into this further internally. We might have to give up CR_ONE_TASK_PER_CORE.


Best
Marcus

P.S.:
Sorry for the lengthy post

On 2/20/19 11:59 AM, Henkel wrote:
Hi Chris,
Hi Marcus,

Just want to understand the cause, too. I'll try to sum it up.

Chris you have

CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=2

and

srun -C gpu -N 1 --ntasks-per-node=80 hostname

works.

Marcus has configured

CPUs=48  Sockets=4 CoresPerSocket=12 ThreadsPerCore=2
(slurmd -C says CPUs=96 Boards=1 SocketsPerBoard=4 CoresPerSocket=12
ThreadsPerCore=2)

and

CR_ONE_TASK_PER_CORE

and

srun -n 48 WORKS

srun -N 1 --ntasks-per-node=48 DOESN'T WORK.

I'm not sure if it's caused by CR_ONE_TASK_PER_CORE, but at least that's
one of the major differences. I'm wondering whether the effort to force
using only physical cores is doubled by removing the 48 threads AND
setting CR_ONE_TASK_PER_CORE. My impression is that with
CR_ONE_TASK_PER_CORE, ntasks-per-node accounts for threads (you have set
ThreadsPerCore=2), hence only 24 may work, but CR_ONE_TASK_PER_CORE
doesn't affect the selection of 'cores only' with ntasks.

We don't use CR_ONE_TASK_PER_CORE; instead our users set either -c 2 or
--hint=nomultithread, which results in core-only allocation.
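For example (a sketch only; ./my_app is a placeholder, and whether this behaves the same under Marcus's configuration is untested on our side):

```
# one task per physical core, hyperthreads left unused
srun --ntasks=24 --hint=nomultithread ./my_app

# or: give each task a whole core, i.e. both hardware threads
srun --ntasks=24 --cpus-per-task=2 ./my_app
```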

You could also enforce this with a job submit plugin, e.g. written in Lua.

The fact that CR_ONE_TASK_PER_CORE is described as "under change" in
the public bugs, and that there is a non-accessible bug about it,
probably suggests it is better not to use it unless you have to.

Best,

Andreas

On 2/20/19 7:49 AM, Chris Samuel wrote:
On Tuesday, 19 February 2019 10:14:21 PM PST Marcus Wagner wrote:

sbatch -N 1 --ntasks-per-node=48 --wrap hostname
submission denied, got jobid 199805
On one of our 40-core nodes with 2 hyperthreads per core:

$ srun -C gpu -N 1 --ntasks-per-node=80 hostname | uniq -c
      80 nodename02

The spec is:

CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=2

Hope this helps!

All the best,
Chris

--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de

