You can also use the InfluxDB profiling plugin I developed that's included in 
the latest Slurm version. It provides live CPU and memory usage per task, 
step, host, and job. You can then build a Grafana dashboard to display the 
live metrics.
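
For reference, this is Slurm's `acct_gather_profile/influxdb` plugin. A
minimal setup might look like the sketch below; the hostname, database
name, and credentials are placeholders, and the exact option names for
your Slurm version are documented in the acct_gather.conf man page:

```ini
# slurm.conf -- enable the InfluxDB profiling plugin
AcctGatherProfileType=acct_gather_profile/influxdb

# acct_gather.conf -- point the plugin at your InfluxDB instance
# (host, database, user, and password below are placeholders)
ProfileInfluxDBHost=influxdb.example.com:8086
ProfileInfluxDBDatabase=slurm_profiling
ProfileInfluxDBUser=slurm
ProfileInfluxDBPass=changeme
# Collect task-level samples (CPU, memory) for all jobs by default
ProfileInfluxDBDefault=Task
```

Instead of profiling everything by default, users can also opt in per
job with `sbatch --profile=task`.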

Regards,
Carlos

Sent from my iPhone

> On 9 Dec 2018, at 14:39, Aravindh Sampathkumar <aravi...@fastmail.com> wrote:
> 
> Hi All.
> 
> I was wondering if anybody has thought of, or hacked together, a way to record 
> the CPU and memory consumption of a job over its entire duration and give a 
> summary of the usage pattern within that job? 
> Not the MaxRSS and CPU Time that already get reported for every job. 
> 
> I'm thinking more of a chart of CPU utilisation, memory usage, and disk 
> usage on a per-second basis, or something like that. 
> 
> Asking because some of my users have no clue about the resource consumption 
> of their jobs, and just blindly ask for far more resources as a "safe" option. 
> It would be a nice way for users to learn simple things, e.g. that they asked 
> for 8 cores but their job ran on just 1 core the entire time because a library 
> they used is limited to a single core. 
> We use cgroups for process accounting and for limiting jobs' CPU and memory 
> usage. We also use QoS to limit resource reservations at the user level. 
> 
> --
>   Aravindh Sampathkumar
>   aravi...@fastmail.com
> 
> 
