[slurm-users] Re: [ext] Restricting local disk storage of jobs

2024-02-06 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hi Tim, we are using the job_container/tmpfs plugin to map /tmp to a local NVMe drive, which works great. I did consider setting up directory quotas; I thought the InitScript [1] option should do the trick. Alas, I didn't get it to work. If I remember correctly, slurm complained about the option being p…
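
A minimal sketch of that setup, assuming the plugin's usual config file layout; the NVMe mount point is hypothetical:

    # slurm.conf
    JobContainerType=job_container/tmpfs

    # job_container.conf
    AutoBasePath=true
    BasePath=/mnt/nvme/slurm   # hypothetical local NVMe mount
    Dirs=/tmp                  # paths replaced with per-job private mounts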

[slurm-users] Re: [ext] Restricting local disk storage of jobs

2024-02-06 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hi Tim, in the end the InitScript didn't contain anything useful, because slurmd rejected the option: "error: _parse_next_key: Parsing error at unrecognized key: InitScript". At this stage I gave up. This was with SLURM 23.02. My plan was to set up the local scratch directory with XFS and then get the script to apply a…
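
A sketch of the kind of InitScript the plan describes, assuming an XFS scratch filesystem mounted with prjquota, that SLURM_JOB_ID is visible to the script, and a hypothetical 100 GiB cap:

    #!/bin/bash
    # Apply an XFS project quota to the per-job scratch directory.
    SCRATCH_FS=/mnt/nvme                     # assumed XFS mount point
    JOB_DIR="${SCRATCH_FS}/${SLURM_JOB_ID}"  # assumed per-job directory

    # Register the directory as an XFS project keyed on the job ID,
    # then cap its block usage at 100 GiB.
    xfs_quota -x -c "project -s -p ${JOB_DIR} ${SLURM_JOB_ID}" "${SCRATCH_FS}"
    xfs_quota -x -c "limit -p bhard=100g ${SLURM_JOB_ID}" "${SCRATCH_FS}"

Note the parsing error above: 23.02 does not recognize the InitScript key, so this would only apply to a release whose job_container.conf man page lists it.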

[slurm-users] Re: [ext] Re: canonical way to run longer shell/bash interactive job (instead of srun inside of screen/tmux at front-end)?

2024-02-28 Thread Hagdorn, Magnus Karl Moritz via slurm-users
On Tue, 2024-02-27 at 08:21 -0800, Brian Andrus via slurm-users wrote: > For us, we put a load balancer in front of the login nodes with session affinity enabled. This makes them land on the same backend node each time. Hi Brian, that sounds interesting - how did you implement session affin…
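
Brian's exact implementation isn't shown, but source-IP affinity in haproxy would be one way to do it; a minimal TCP sketch with placeholder addresses:

    frontend ssh_in
        bind *:22
        mode tcp
        default_backend login_nodes

    backend login_nodes
        mode tcp
        stick-table type ip size 100k expire 8h  # remember client -> node
        stick on src                             # same source IP, same login node
        server login1 192.0.2.11:22 check
        server login2 192.0.2.12:22 check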

[slurm-users] Re: [ext] scrontab question

2024-05-07 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hm, strange. I don't see a problem with the time specs, although I would use */5 * * * * to run something every 5 minutes. In my scrontab I also specify a partition, etc., but I don't think that is necessary. Regards, Magnus. On Tue, 2024-05-07 at 12:06 -0500, Sandor via slurm-users wrote: > I am work…
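
A minimal scrontab entry of that shape (partition name and script path are placeholders):

    #SCRON --partition=debug
    #SCRON --time=00:10:00
    */5 * * * * /home/magnus/bin/periodic-task.sh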

[slurm-users] Re: [ext] API - Specify GPUs

2024-07-26 Thread Hagdorn, Magnus Karl Moritz via slurm-users
On Fri, 2024-07-26 at 19:34, jpuerto--- via slurm-users wrote: > It does not seem that the REST API allows for folks to configure their jobs to utilize GPUs using the traditional methods. I.e., there does not appear to be an equivalent between the --gpus (or --gres) flag on sbatch/srun an…
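
The reply is cut off, but newer OpenAPI versions of the job-submit payload do expose TRES fields that correspond to those flags. A hedged sketch (endpoint version, field names, and nesting vary between OpenAPI plugin releases):

    POST /slurm/v0.0.40/job/submit
    {
      "job": {
        "name": "gpu-job",
        "partition": "gpu",
        "tres_per_node": "gres/gpu:1",
        "current_working_directory": "/tmp",
        "environment": ["PATH=/bin:/usr/bin"],
        "script": "#!/bin/bash\nnvidia-smi"
      }
    }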

[slurm-users] slurmrestd health check

2025-02-19 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hi there, we use haproxy to distribute SLURM REST API requests to multiple instances of slurmrestd. For haproxy we need a health check. At the moment we just check that we get a 401 status. This works, but we end up with a lot of noise in the log files. It would be very nice if th…
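
A sketch of the check described, with an assumed port and endpoint; expecting 401 treats "authentication required" as proof of liveness:

    backend slurmrestd
        mode http
        option httpchk GET /openapi/v3
        http-check expect status 401   # unauthenticated probe, server is up
        server rest1 127.0.0.1:6820 check inter 10s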

[slurm-users] read-only slurm user

2025-06-23 Thread Hagdorn, Magnus Karl Moritz via slurm-users
Hi there, we use the slurm prometheus exporter to collect slurm metrics. This works pretty well. However, we have noticed that metrics for some of the restricted partitions are not collected. It occurred to me that this is because we are using an unprivileged user to run the exporter. I am trying…
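
One hedged possibility: hidden or restricted partitions are visible to accounts with elevated AdminLevel, so granting the exporter's (hypothetical) user Operator rights in the accounting database may be enough, without making it a full administrator:

    sacctmgr modify user exporter set AdminLevel=Operator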