Your slurm.conf line doesn't specify the node's physical memory:
NodeName=ozd2485u Gres=gpu:2 Sockets=2 CoresPerSocket=14 ThreadsPerCore=2 State=UNKNOWN
See "man slurm.conf":
RealMemory
    Size of real memory on the node in megabytes (e.g. "2048").
    The default value is 1.
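As a rough illustration of the fix, here is a minimal Python sketch that derives a RealMemory value from /proc/meminfo-style text and adds it to the NodeName line. The helper names and the sample MemTotal figure are made up for the example, not taken from your node. On the real compute node, "slurmd -C" should also print the detected hardware, including RealMemory, which is probably the simpler route.

```python
import re


def real_memory_mb(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, in whole megabytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    # /proc/meminfo reports kB; slurm.conf expects RealMemory in MB.
    return int(match.group(1)) // 1024


def add_real_memory(node_line, mem_mb):
    """Set RealMemory=<mem_mb> on a NodeName line (hypothetical helper)."""
    if "RealMemory=" in node_line:
        return re.sub(r"RealMemory=\d+", f"RealMemory={mem_mb}", node_line)
    if " State=" in node_line:
        return node_line.replace(" State=", f" RealMemory={mem_mb} State=", 1)
    return f"{node_line} RealMemory={mem_mb}"


# Hypothetical sample; on a real node use open("/proc/meminfo").read().
sample = "MemTotal:       131072000 kB\nMemFree:        100000000 kB\n"
node = ("NodeName=ozd2485u Gres=gpu:2 Sockets=2 CoresPerSocket=14 "
        "ThreadsPerCore=2 State=UNKNOWN")
print(add_real_memory(node, real_memory_mb(sample)))
```

You may want to configure a value somewhat below the physical total so that the operating system keeps some headroom. After editing slurm.conf, the daemons need to be restarted or reconfigured to pick up the change.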
On 4/9/19 8:47 AM, sudhagar s wrote:
I am attaching my slurm.conf file. Can you please help me find the issue?
On Tue, Apr 9, 2019 at 12:08 PM Ole Holm Nielsen
<ole.h.niel...@fysik.dtu.dk> wrote:
On 09-04-2019 08:33, sudhagar s wrote:
> Thanks Ole,
>
> When I run "scontrol show node" it lists the details, and there I can
> see RealMemory=1. Will this be a problem?
In your "scontrol show node" image I read RealMemory=1 (units of MB) and
mem=1M. I think you configured slurm.conf incorrectly.
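To check this without reading a screenshot, a small Python sketch like the following could flag nodes that are still at the 1 MB default. It assumes the one-record-per-line output of "scontrol -o show node". The function name and the abbreviated sample records (including the second node name) are hypothetical.

```python
import re


def nodes_with_default_memory(scontrol_text):
    """Return names of nodes whose RealMemory is still the default of 1 MB."""
    flagged = []
    # Assumes one node record per line, as printed by "scontrol -o show node".
    for record in scontrol_text.splitlines():
        name = re.search(r"NodeName=(\S+)", record)
        mem = re.search(r"RealMemory=(\d+)", record)
        if name and mem and int(mem.group(1)) == 1:
            flagged.append(name.group(1))
    return flagged


# Hypothetical, heavily abbreviated sample records.
sample_nodes = (
    "NodeName=ozd2485u CoresPerSocket=14 RealMemory=1 AllocMem=0 State=IDLE\n"
    "NodeName=othernode CoresPerSocket=14 RealMemory=128000 State=IDLE\n"
)
print(nodes_with_default_memory(sample_nodes))
```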
> On Tue, Apr 9, 2019 at 11:53 AM Ole Holm Nielsen
> <ole.h.niel...@fysik.dtu.dk> wrote:
>
> On 09-04-2019 07:37, sudhagar s wrote:
> > Hi, I am a newbie in Slurm, trying to set up a cluster for ML training
> > purposes. I created a control node and a compute node. Both are up
> > and running.
> >
> > When I enter "srun -N 1 hostname" it says
> > " srun error memory specification can not be satisfied"
> > "unable to allocate resources: requested node configuration is not
> > available"
> >
> > How do I fix this?
>
> Probably you made some errors in configuring slurm.conf. Look at your
> NodeName and PartitionName definitions to figure out why the resources
> are incorrect.
>
> /Ole