Re: [slurm-users] Disabling SWAP space will it effect SLURM working

2023-12-11 Thread Davide DelVento
A little late here, but yes, everything Hans said is correct, and if you are
worried about slurm (or other critical system software) getting killed by
the OOM killer, you can work around it by properly configuring cgroups.
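
For reference, a minimal cgroup-based sketch along those lines (these are
standard slurm.conf/cgroup.conf parameters, but the values are illustrative
and not taken from this thread):

# slurm.conf
TaskPlugin=task/cgroup
# cgroup.conf
ConstrainRAMSpace=yes      # keep each job inside its allocated RAM
ConstrainSwapSpace=yes     # and stop it from spilling into swap instead
AllowedRAMSpace=100        # percent of the allocation a job may use
AllowedSwapSpace=0         # no extra swap allowance on top of that

The idea is that a job exceeding its allocation gets OOM-killed inside its
own cgroup instead of taking down slurmd or other system processes.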

On Wed, Dec 6, 2023 at 2:06 AM Hans van Schoot  wrote:

> Hi Joseph,
>
> This might depend on the rest of your configuration, but in general swap
> should not be needed for anything on Linux.
> BUT: you might get OOM killer messages in your system logs, and SLURM
> might fall victim to the OOM killer (OOM = Out Of Memory) if you run
> applications on the compute node that eat up all your RAM.
> Swap does not prevent this, but it makes it less likely to happen.
> I've seen OOM kill slurm daemon processes on compute nodes with swap;
> usually slurm recovers just fine after the application that ate up all the
> RAM ends up getting killed by the OOM killer. My compute nodes are not
> configured to monitor memory usage of jobs. If you have memory configured
> as a managed resource in your SLURM setup, and you leave a bit of headroom
> for the OS itself (e.g. only hand out a maximum of 250GB RAM to jobs on
> your 256GB RAM nodes), you should be fine.
>
> cheers,
> Hans
>
>
> ps. I'm just a happy slurm user/admin, not an expert, so I might be wrong
> about everything :-)
>
>
>
> On 06-12-2023 05:57, John Joseph wrote:
>
> Dear All,
> Good morning
> We have a 4-node SLURM instance [256 GB RAM in each node] which we
> installed, and it is working fine.
> We have 2 GB of SWAP space on each node. To make full use of the
> system, we want to disable the SWAP memory.
>
> I would like to know: if I disable the SWAP partition, will it affect SLURM
> functionality?
>
> Advice requested
> Thanks
> Joseph John
>
>
>


Re: [slurm-users] Troubleshooting job stuck in Pending state

2023-12-11 Thread Davide DelVento
By getting "stuck", do you mean the job stays PENDING forever, or does it
eventually run? I've seen the latter (and I agree with you; I wish Slurm
would log things like "I looked at this job and I am not starting it yet
because ...") but not the former.
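
In case it helps with the digging, a few read-only commands that can
sometimes show why a job is still pending (all standard Slurm tools; the
job ID is just a placeholder):

sprio -j 12345              # priority components (age, fairshare, ...) of the stuck job
sprio -l                    # priorities of all pending jobs, to see what outranks it
squeue --start -j 12345     # the scheduler's current estimated start time, if it has one
scontrol show job 12345     # full job record, including the Reason field

For the backfill side specifically, "scontrol setdebugflags +Backfill" (and
"-Backfill" to turn it off again) enables just the backfill debug messages
without raising the overall log level.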

On Fri, Dec 8, 2023 at 9:00 AM Pacey, Mike  wrote:

> Hi folks,
>
>
>
> I’m looking for some advice on how to troubleshoot jobs we occasionally
> see on our cluster that are stuck in a pending state despite sufficient
> matching resources being free. In the case I’m trying to troubleshoot, the
> Reason field lists (Priority), but I can’t find any way to get the scheduler
> to tell me exactly which higher-priority job is blocking it.
>
>
>
> - I tried setting the scheduler log level to debug3 for 5 minutes at
>   one point, but my logfile ballooned from 0.5G to 1.5G and didn’t offer any
>   useful info for this case.
> - I’ve tried ‘scontrol schedloglevel 1’ but it returns the error:
>   ‘slurm_set_schedlog_level error: Requested operation is presently disabled’
>
>
>
> I’m aware that the backfill scheduler will occasionally hold on to free
> resources in order to schedule a larger job with higher priority, but in
> this case I can’t find any pending job that might fit the bill.
>
>
>
> And to possibly complicate matters, this is on a large partition that has
> no maximum time limit and most pending jobs have no time limits either. (We
> use backfill/fairshare as we have smaller partitions of rarer resources
> that benefit from it, plus we’re aiming to use fairshare even on the
> no-time-limits partitions to help balance out usage).
>
>
>
> Hoping someone can provide pointers.
>
>
>
> Regards,
>
> Mike
>


Re: [slurm-users] Disabling SWAP space will it effect SLURM working

2023-12-11 Thread Paul Edmon
We've been running for years without swap with no issues. You may
want to set MemSpecLimit in your config to reserve memory for the OS, so
that you don't OOM the system with user jobs:
https://slurm.schedmd.com/slurm.conf.html#OPT_MemSpecLimit
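
For illustration, a node definition using it might look like the sketch
below (node names and the 8 GB figure are made up, roughly sized for the
256 GB nodes in this thread; MemSpecLimit is in MB and is subtracted from
what jobs can be allocated):

NodeName=node[01-04] RealMemory=257000 MemSpecLimit=8192 State=UNKNOWN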


-Paul Edmon-

On 12/11/2023 11:19 AM, Davide DelVento wrote:
A little late here, but yes, everything Hans said is correct, and if you
are worried about slurm (or other critical system software) getting
killed by the OOM killer, you can work around it by properly configuring cgroups.


On Wed, Dec 6, 2023 at 2:06 AM Hans van Schoot  wrote:

Hi Joseph,

This might depend on the rest of your configuration, but in
general swap should not be needed for anything on Linux.
BUT: you might get OOM killer messages in your system logs, and
SLURM might fall victim to the OOM killer (OOM = Out Of Memory) if
you run applications on the compute node that eat up all your RAM.
Swap does not prevent this, but it makes it less likely to
happen. I've seen OOM kill slurm daemon processes on compute nodes
with swap; usually slurm recovers just fine after the application
that ate up all the RAM ends up getting killed by the OOM killer.
My compute nodes are not configured to monitor memory usage of
jobs. If you have memory configured as a managed resource in your
SLURM setup, and you leave a bit of headroom for the OS itself
(e.g. only hand out a maximum of 250GB RAM to jobs on your 256GB
RAM nodes), you should be fine.

cheers,
Hans


ps. I'm just a happy slurm user/admin, not an expert, so I might
be wrong about everything :-)



On 06-12-2023 05:57, John Joseph wrote:

Dear All,
Good morning
We have a 4-node SLURM instance [256 GB RAM in each node] which
we installed, and it is working fine.
We have 2 GB of SWAP space on each node. To make full use of the
system, we want to disable the SWAP memory.

I would like to know: if I disable the SWAP partition, will it affect
SLURM functionality?

Advice requested
Thanks
Joseph John



[slurm-users] powersave: excluding nodes

2023-12-11 Thread Davide DelVento
Following the example from https://slurm.schedmd.com/power_save.html
regarding SuspendExcNodes

I configured my slurm.conf with

SuspendExcNodes=node[01-12]:2,node[13-32]:2,node[33-34]:1,nodegpu[01-02]:1
SuspendExcStates=down,drain,fail,maint,not_responding,reserved
#SuspendExcParts=

(the nodes in the different groups have different amounts of physical
memory).

Unfortunately, it seems to me that slurm does not honor this setting: it
excludes only the two nodes from one group, but shuts off everything else.
Is there another setting which may inadvertently cause this problem, or is
this a known bug?
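
For what it's worth, two standard read-only commands should show what the
controller actually parsed and which nodes it has powered down (the grep
pattern is just a convenience):

scontrol show config | grep -i suspend   # SuspendExcNodes/SuspendExcStates as slurmctld sees them
sinfo -N -o '%N %t'                      # compact node states; a trailing ~ marks powered-down nodes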

Thanks!


Re: [slurm-users] Slurm powersave

2023-12-11 Thread Davide DelVento
In case it's useful to others: I've been able to get this working by having
the "no action" script stop the slurmd daemon and start it *with the -b
option*.
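
A minimal sketch of one way to implement that (the ssh/systemctl mechanism
and the paths are my assumptions, not details from the setup described
above):

#!/bin/bash
# Called by slurmctld with a hostlist expression (e.g. node[01-04]) as $1.
for node in $(scontrol show hostnames "$1"); do
    # Assumes passwordless ssh from the controller host to each compute node.
    ssh "$node" "systemctl stop slurmd && /usr/sbin/slurmd -b" &
done
wait

Restarting slurmd with -b makes it report the node as rebooted, which is
what slurmctld expects to see when it "resumes" a node.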

On Fri, Oct 6, 2023 at 4:28 AM Ole Holm Nielsen 
wrote:

> Hi Davide,
>
> On 10/5/23 15:28, Davide DelVento wrote:
> > IMHO, "pretending" to power down nodes defies the logic of the Slurm
> > power_save plugin.
> >
> > And it sure is useless ;)
> > But I was using the suggestion from
> > https://slurm.schedmd.com/power_save.html
> >  which says
> >
> > You can also configure Slurm with programs that perform no action as
> > *SuspendProgram* and *ResumeProgram* to assess the potential impact of
> > power saving mode before enabling it.
>
> I had not noticed the above sentence in the power_save manual before!  So
> I decided to test a "no action" power saving script, similar to what you
> have done, applying it to a test partition.  I conclude that "no action"
> power saving DOES NOT WORK, at least in Slurm 23.02.5.  So I opened a bug
> report https://bugs.schedmd.com/show_bug.cgi?id=17848 to find out if the
> documentation is obsolete, or if there may be a bug.  Please follow that
> bug to find out the answer from SchedMD.
>
> What I *believe* (but not with 100% certainty) really happens with power
> saving in the current Slurm versions is what I wrote yesterday:
>
> > Slurmctld expects suspended nodes to *really* power
> > down (slurmd is stopped).  When slurmctld resumes a suspended node, it
> > expects slurmd to start up when the node is powered on.  There is a
> > ResumeTimeout parameter which I've set to about 15-30 minutes in case of
> > delays due to BIOS updates and the like - the default of 60 seconds is
> > WAY too small!
>
> I hope this helps,
> Ole
>
>


Re: [slurm-users] powersave: excluding nodes

2023-12-11 Thread Davide DelVento
Forgot to mention: this is with slurm 23.02.6 (apologies for the double
message).

On Mon, Dec 11, 2023 at 9:49 AM Davide DelVento 
wrote:

> Following the example from https://slurm.schedmd.com/power_save.html
> regarding SuspendExcNodes
>
> I configured my slurm.conf with
>
> SuspendExcNodes=node[01-12]:2,node[13-32]:2,node[33-34]:1,nodegpu[01-02]:1
> SuspendExcStates=down,drain,fail,maint,not_responding,reserved
> #SuspendExcParts=
>
> (the nodes in the different groups have different amounts of physical
> memory).
>
> Unfortunately, it seems to me that slurm does not honor this setting: it
> excludes only the two nodes from one group, but shuts off everything else.
> Is there another setting which may inadvertently cause this problem, or is
> this a known bug?
>
> Thanks!
>