--ntasks-per-node is meant to be used in conjunction with the --nodes option. From https://slurm.schedmd.com/sbatch.html:

--ntasks-per-node=<ntasks>
    Request that ntasks be invoked on each node. If used with the
    --ntasks option, the --ntasks option will take precedence and
    --ntasks-per-node will be treated as a maximum count of
    tasks per node. Meant to be used with the --nodes option...

If you don't specify --ntasks, it defaults to --ntasks=1, as Andreas said. From https://slurm.schedmd.com/sbatch.html:

-n, --ntasks=<number>
    sbatch does not launch tasks, it requests an allocation of
    resources and submits a batch script. This option advises the
    Slurm controller that job steps run within the allocation will
    launch a maximum of number tasks and to provide for sufficient
    resources. The default is one task per node, but note that the
    --cpus-per-task option will change this default.
So the correct way to specify your job is either like this:

--ntasks=48

or

--nodes=1 --ntasks-per-node=48

Specifying both --ntasks-per-node and --ntasks at the same time is not correct.
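Put into a batch script, the two correct forms above could look like the following sketch (the application name my_mpi_app is only a placeholder, not something from this thread):

```shell
#!/bin/bash
# Variant 1: request 48 tasks and let Slurm decide the node layout.
#SBATCH --ntasks=48

# Variant 2: pin all 48 tasks onto a single node.
# Use this pair INSTEAD of --ntasks above, not together with it
# (uncomment these two and comment out --ntasks):
##SBATCH --nodes=1
##SBATCH --ntasks-per-node=48

# srun launches one task per allocated task slot.
srun ./my_mpi_app   # placeholder application
```

Which variant to pick depends on whether the job actually needs all tasks on one node; variant 1 leaves the placement to the scheduler.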


Prentice

On 2/14/19 1:09 AM, Henkel, Andreas wrote:
Hi Marcus,

What just came to my mind: if you don't set --ntasks, isn't the default just 1? All
examples I know that use --ntasks-per-node also set --ntasks, with ntasks >=
ntasks-per-node.

Best,
Andreas

On 2/14/19 6:33 AM, Marcus Wagner <wag...@itc.rwth-aachen.de> wrote:

Hi all,

I have narrowed this down a little bit.

The really astonishing thing is that if I use

--ntasks=48

I can submit the job, it will be scheduled onto one host:

    NumNodes=1 NumCPUs=48 NumTasks=48 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
    TRES=cpu=48,mem=182400M,node=1,billing=48

but as soon as I change --ntasks to --ntasks-per-node (which should be the
same, since --ntasks=48 is scheduled onto one host anyway), I get the error:

sbatch: error: CPU count per node can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not 
available


Is there no one else who observes this behaviour?
Any explanations?


Best
Marcus


On 2/13/19 1:48 PM, Marcus Wagner wrote:
Hi all,

I am seeing some strange behaviour here.
We are using Slurm 18.08.5-2 on CentOS 7.6.

Let me first describe our compute nodes:

NodeName=ncm[0001-1032] CPUs=48 Sockets=4 CoresPerSocket=12 ThreadsPerCore=2 RealMemory=185000 Feature=skx8160,hostok,hpcwork Weight=10541 State=UNKNOWN

We have the following config set:

$>scontrol show config | grep -i select
SelectType              = select/cons_res
SelectTypeParameters    = CR_CORE_MEMORY,CR_ONE_TASK_PER_CORE


So I have 48 cores on one node. According to the sbatch manpage, I should
be able to do the following:

#SBATCH --ntasks=48
#SBATCH --ntasks-per-node=48

But I get the following error:
sbatch: error: Batch job submission failed: Requested node configuration is not 
available


Does anyone have an explanation for this?


Best
Marcus

--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de
