* Tuesday, 29 May 2018 13:16
*To:* Slurm User Community List
*Subject:* Re: [slurm-users] Using free memory available when
allocating a node to a job
Alexandre, you have made a very good point here. "Oftentimes users
only input 1G as they really have no idea of the memory requirements,"
need.
Regards,
Alexandre
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of
John Hearns
Sent: Tuesday, 29 May 2018 13:16
To: Slurm User Community List
Subject: Re: [slurm-users] Using free memory available when allocating a node to
a job
Alexandre, you have made a
John Hearns writes:
> Alexandre, you have made a very good point here. "Oftentimes users only input
> 1G as they really have no idea of the memory requirements,"
> At my last job we introduced cgroups (this was in PBSPro). We had to enforce
> a minimum request for memory.
> Users then asked us
Thank you for your inputs.
>
> *From:* slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] *On
> Behalf Of* John Hearns
> *Sent:* Tuesday, 29 May 2018 12:39
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] Using free memory available when all
for your inputs.
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of
John Hearns
Sent: Tuesday, 29 May 2018 12:39
To: Slurm User Community List
Subject: Re: [slurm-users] Using free memory available when allocating a node to
a job
Also regarding memory, there are system tunings you can set for the
behaviour of the Out-Of-Memory (OOM) Killer and also for VM overcommit.
I have seen the VM overcommit parameters discussed elsewhere, and
for HPC people generally advise disabling overcommit.
https://www.suse.com/support/kb/doc/?id=
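
To check which overcommit policy a node is currently using, something along these lines might help. It is only a minimal sketch: it assumes a Linux node exposing the standard `/proc/sys/vm/overcommit_memory` file, and the helper names `describe_overcommit` and `read_overcommit` are made up for illustration. Which mode suits a given cluster remains a site decision.

```python
# Meaning of the values found in /proc/sys/vm/overcommit_memory on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the commit limit",
}


def describe_overcommit(raw):
    """Translate the raw file content (e.g. "2\n") into a description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, "unknown mode %d" % mode)


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read the current setting from the node (Linux only)."""
    with open(path) as handle:
        return describe_overcommit(handle.read())
```

On a compute node, `print(read_overcommit())` would report the active policy; mode 2 is the one usually meant by "disabling overcommit".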
Alexandre, it would be helpful if you could say why this behaviour is
desirable.
For instance, do you have codes which need a large amount of memory, and
are your users seeing these codes crash because other codes
running on the same nodes are using memory?
I have two thoughts:
A) en
Hi,
in the cluster where I'm deploying Slurm, job allocation has to be based on
the actual free memory available on the node, not just the memory allocated by Slurm.
This is nonnegotiable and I understand that it's not how Slurm is designed to
work, but I'm trying anyway.
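
To make "actual free memory" concrete, here is a rough sketch of how it could be measured on a node. It assumes a Linux node with the usual `/proc/meminfo` layout (values in kB); the function names are hypothetical, and this is not something Slurm itself provides or calls.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_mib(text):
    """Return memory usable by new jobs, in MiB.

    Prefers MemAvailable (the kernel's estimate, which counts reclaimable
    cache) and falls back to MemFree on kernels that do not report it.
    """
    info = parse_meminfo(text)
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib // 1024


def read_available_mib(path="/proc/meminfo"):
    """Read the figure from the local node (Linux only)."""
    with open(path) as handle:
        return available_mib(handle.read())
```

Whether MemAvailable or MemFree is the right figure depends on whether page cache should count as usable memory for new jobs.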
Among the solutions that