Would job profiling with HDF5 work as well? https://slurm.schedmd.com/hdf5_profile_user_guide.html
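In case it helps, the rough setup (sketched from memory of that guide; the paths and the 30-second interval below are only placeholders, check the docs for your version) looks something like this:

    # slurm.conf -- enable the HDF5 profiling plugin and periodic sampling
    AcctGatherProfileType=acct_gather_profile/hdf5
    JobAcctGatherFrequency=30            # sampling interval in seconds (placeholder)

    # acct_gather.conf -- where the per-node .h5 files get written
    ProfileHDF5Dir=/shared/slurm/profile_data   # placeholder; must be writable on every node
    ProfileHDF5Default=None                     # or Task, to profile all jobs by default

    # per job: ask for task-level profiling
    sbatch --profile=task job.sh

    # afterwards: merge the per-node files into one HDF5 file you can plot from
    sh5util -j <jobid> -o job_<jobid>.h5

The merged file contains the per-task CPU, memory and I/O time series, so a short script can turn it into the per-second charts Aravindh is after.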
Jacob

On Sun, Dec 9, 2018 at 4:17 PM Sam Hawarden <sam.hawar...@otago.ac.nz> wrote:
> Hi Aravindh
>
> For our small 3 node cluster I've hacked together a per-node python script
> that collects current and peak cpu, memory and scratch disk usage data on
> all jobs running on the cluster and builds a fairly simple web-page based
> on it. It shouldn't be hard to make it store those data points over time,
> then shove them through an R script to plot the usage:
>
> https://github.com/shawarden/simple-web
>
> Cheers,
> Sam
>
> ------------------------------
> Sam Hawarden
> Assistant Research Fellow
> Pathology Department
> Dunedin School of Medicine
> ------------------------------
> *From:* slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of
> Aravindh Sampathkumar <aravi...@fastmail.com>
> *Sent:* Monday, 10 December 2018 02:39
> *To:* slurm-users@lists.schedmd.com
> *Subject:* [slurm-users] CPU & memory usage summary for a job
>
> Hi All.
>
> I was wondering if anybody has thought of or hacked around a way to record
> CPU and memory consumption of a job during its entire duration and give a
> summary of the usage pattern within that job?
> Not the MaxRSS and CPU Time that already gets reported for every job.
>
> I'm thinking more like a chart of CPU utilisation, memory usage, and disk
> usage on a per-second basis or something like that.
>
> Asking because some of my users have no clue about the resource
> consumption of their jobs, and just blindly ask for way more resources as
> a "safe" option. It would be a nice way for users to know simple things like
> - they asked for 8 cores, but their job ran on just 1 core the entire time
> because a library they used is single-core limited.
> We use Cgroups for process accounting and limiting jobs' cpu and memory
> usage. We also use QoS for limiting resource reservations at user level.
>
> --
> Aravindh Sampathkumar
> aravi...@fastmail.com
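If rolling your own along the lines of Sam's script, the core of a per-node sampler can be very small. Below is a bare sketch (not the simple-web code; it assumes cgroup v1 with Slurm's usual slurm/uid_*/job_* hierarchy, and the paths and interval are placeholders you would adjust for your distro and cgroup version):

    #!/usr/bin/env python3
    # Minimal per-node sampler: append one CSV row per running job per interval.
    # Columns: unix timestamp, job id, memory usage (bytes), cumulative CPU time (ns).
    import csv
    import glob
    import os
    import re
    import time

    INTERVAL = 5                          # seconds between samples (placeholder)
    OUTFILE = "/var/log/job_usage.csv"    # placeholder output path

    def read_int(path):
        # Return the integer contents of a cgroup file, or None if unreadable.
        try:
            with open(path) as f:
                return int(f.read().strip())
        except (OSError, ValueError):
            return None

    def sample():
        rows = []
        now = int(time.time())
        for memdir in glob.glob("/sys/fs/cgroup/memory/slurm/uid_*/job_*"):
            jobid = re.search(r"job_(\d+)", memdir).group(1)
            rss = read_int(os.path.join(memdir, "memory.usage_in_bytes"))
            cpudir = memdir.replace("/memory/", "/cpuacct/")
            cputime_ns = read_int(os.path.join(cpudir, "cpuacct.usage"))
            rows.append([now, jobid, rss, cputime_ns])
        return rows

    if __name__ == "__main__":
        while True:
            with open(OUTFILE, "a", newline="") as f:
                csv.writer(f).writerows(sample())
            time.sleep(INTERVAL)

Differencing successive CPU-time values gives per-interval utilisation, which an R or gnuplot script can then turn into the sort of per-job charts discussed above.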