[slurm-users] Regarding disk Limit through Slurm
Hi, I have been trying to find out whether there is a way to restrict disk usage with Slurm using cgroups. I am aware of the TmpFS mechanism, but it seems to be used only to declare the required space; it does not enforce a limit if a file grows beyond that size. Any suggestions would be appreciated. Thanks in advance! Shaghuf Rahman
[slurm-users] monitoring and accounting
Hello Everyone, We would like to improve visibility into our cluster usage. We have Ganglia and currently use sacct, but I was wondering whether there is a recommended web tool that provides both monitoring and accounting (user- and admin-friendly)? Thanks in advance Christine
Re: [slurm-users] monitoring and accounting
At a place I worked before, we used XDMOD several years ago. It was a bit tricky to set up correctly, and getting started with data collection was not exactly intuitive for a user (managers, allocation specialists and other not-super-technical people were most of our users). But once familiar with it, it worked great. Where I work now, monitoring and accounting are low on our priority list, so it has been a while since I touched XDMOD. Hopefully they have since improved user and administration friendliness while keeping all the great things it could do. (In reply to LEROY Christine 208562 <christine.ler...@cea.fr>, Fri, May 5, 2023 at 7:08 AM.)
Re: [slurm-users] monitoring and accounting
Something I have been impressed with is Netdata. It is in the standard repositories and will auto-detect quite a few things on a node. It is great for real-time monitoring of a node/job. I also use Prometheus and Grafana for historical data (anything over 5 minutes). Brian Andrus (In reply to LEROY Christine 208562, 5/5/2023 6:05 AM.)
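For the accounting side of this thread, one low-tech starting point is to post-process sacct output into per-user totals that any dashboard can then display. What follows is only a minimal sketch, not something proposed by the posters: it assumes the text comes from a command such as `sacct -a -X -n -P -o User,CPUTimeRAW` (pipe-delimited, no header, one row per job allocation), and the function name `cpu_seconds_per_user` is hypothetical.

```python
from collections import defaultdict


def cpu_seconds_per_user(sacct_text):
    """Sum CPU-seconds per user from pipe-delimited `User|CPUTimeRAW` rows.

    Assumption: the input was produced by something like
    `sacct -a -X -n -P -o User,CPUTimeRAW`; adjust the parsing if other
    fields are requested.
    """
    totals = defaultdict(int)
    for line in sacct_text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        user, _, raw = line.partition("|")
        if not user or not raw.isdigit():
            continue  # skip rows without a user or a numeric value
        totals[user] += int(raw)
    return dict(totals)


# Illustrative input only; these rows are made up, not real cluster data.
sample = "alice|3600\nbob|120\nalice|400\n"
print(cpu_seconds_per_user(sample))  # {'alice': 4000, 'bob': 120}
```

In practice the text would probably be captured with `subprocess.run(..., capture_output=True, text=True)` and the totals exported to whatever tool ends up being used for visualisation.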
[slurm-users] Limit run time of interactive jobs
Hi All, Quick question. Is there a way to limit the runtime on a partition only for salloc? I would like batch jobs to keep the partition's default maximum runtime, but interactive jobs to have a shorter allowed runtime. Thanks!