This is not exactly an answer to your question (I don't know where the cgroup hierarchy is cleaned up), but if you can prefix whatever the job executes with https://github.com/NCAR/peak_memusage (which also uses getrusage), or a variant of it, you will be able to log the peak memory usage at the end of the job.
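In case it helps, here is a rough sketch of what such a "variant" could look like. This is not the peak_memusage code, just a minimal Python wrapper built on the same getrusage idea; the function name run_and_report is made up for illustration. It assumes Linux, where ru_maxrss is reported in kilobytes, and note that with RUSAGE_CHILDREN the value is the peak RSS of the largest waited-for descendant, not the sum over the whole process tree.

```python
import resource
import subprocess
import sys


def run_and_report(argv):
    """Run argv as a child process and return (exit_code, peak_rss_kib).

    Hypothetical helper, not part of peak_memusage or Slurm.
    """
    # Run the real command and wait for it, so its usage is accounted
    # to this process's "children" counters.
    completed = subprocess.run(argv)

    # RUSAGE_CHILDREN covers descendants that have terminated and been
    # waited for. ru_maxrss is the peak RSS of the largest such
    # descendant (kilobytes on Linux, bytes on macOS).
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)

    print(f"peak RSS of largest child: {usage.ru_maxrss} KiB", file=sys.stderr)
    return completed.returncode, usage.ru_maxrss
```

You would call it with the command the job normally runs, e.g. run_and_report(["./my_app", "--input", "data"]) (a placeholder command), and log the returned value wherever you keep your accounting data.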
On Thu, May 16, 2024 at 4:10 PM Emyr James via slurm-users <slurm-users@lists.schedmd.com> wrote:

> Hi,
>
> We are trying out slurm having been running grid engine for a long while.
> In grid engine, the cgroups peak memory and max_rss are generated at the
> end of a job and recorded. It logs the information from the cgroup
> hierarchy as well as doing a getrusage call right at the end on the parent
> pid of the whole job "container" before cleaning up.
> With slurm it seems that the only way memory is recorded is by the acct
> gather polling. I am trying to add something in an epilog script to get the
> memory.peak but It looks like the cgroup hierarchy has been destroyed by
> the time the epilog is run.
> Where in the code is the cgroup hierarchy cleared up ? Is there no way to
> add something in so that the accounting is updated during the job cleanup
> process so that peak memory usage can be accurately logged ?
>
> I can reduce the polling interval from 30s to 5s but don't know if this
> causes a lot of overhead and in any case this seems to not be a sensible
> way to get values that should just be determined right at the end by an
> event rather than using polling.
>
> Many thanks,
>
> Emyr
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>