[slurm-users] Re: memory high water mark reporting

Davide DelVento via slurm-users Thu, 16 May 2024 16:03:38 -0700

Not exactly the answer to your question (which I don't know), but if you can
prefix whatever is executed with https://github.com/NCAR/peak_memusage (which
also uses getrusage), or a variant of it, you will be able to record the peak
memory at the end of the job.
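
For illustration, here is a rough sketch of what such a "variant" could look
like. It is not how peak_memusage itself is implemented; it is a hypothetical
Python wrapper (the name run_and_report is made up here) that runs the real
command as a child and then asks getrusage for the children's high water
mark. It assumes Linux, where ru_maxrss is reported in KiB, and note that the
value is the largest RSS of any single waited-for descendant, not the sum over
all processes in the job.

```python
import resource
import subprocess
import sys


def run_and_report(argv):
    """Run argv as a child process; return (exit_code, peak_rss_kib).

    Hypothetical helper for illustration only. Assumes Linux, where
    ru_maxrss is expressed in KiB.
    """
    completed = subprocess.run(argv)
    # RUSAGE_CHILDREN covers children that have terminated and been waited
    # for; subprocess.run() has already waited by the time it returns.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


def main(argv):
    """Wrapper entry point: report peak RSS on stderr, pass on exit code."""
    if not argv:
        print("usage: peakmem.py COMMAND [ARGS...]", file=sys.stderr)
        return 2
    code, peak_kib = run_and_report(argv)
    print(f"peak RSS of children: {peak_kib} KiB", file=sys.stderr)
    return code
```

To use it as a prefix, one would end the script with
sys.exit(main(sys.argv[1:])) and launch the job step as, say,
"python3 peakmem.py ./my_app args" (the script and application names are
placeholders).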


On Thu, May 16, 2024 at 4:10 PM Emyr James via slurm-users <
slurm-users@lists.schedmd.com> wrote:

> Hi,
>
> We are trying out slurm having been running grid engine for a long while.
> In grid engine, the cgroups peak memory and max_rss are generated at the
> end of a job and recorded. It logs the information from the cgroup
> hierarchy as well as doing a getrusage call right at the end on the parent
> pid of the whole job "container" before cleaning up.
> With slurm it seems that the only way memory is recorded is by the acct
> gather polling. I am trying to add something in an epilog script to get the
> memory.peak but it looks like the cgroup hierarchy has been destroyed by
> the time the epilog is run.
> Where in the code is the cgroup hierarchy cleaned up? Is there no way to
> add something in so that the accounting is updated during the job cleanup
> process so that peak memory usage can be accurately logged?
>
> I can reduce the polling interval from 30s to 5s but don't know if this
> causes a lot of overhead and in any case this seems to not be a sensible
> way to get values that should just be determined right at the end by an
> event rather than using polling.
>
> Many thanks,
>
> Emyr
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>

