Re: [slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU

2021-12-14 Thread Ryan Novosielski
Did a git bisect and answered my own question: “yes.” [novosirj@amarel1 Slurm_tools]$ git bisect good 72cd05d78f1077142143f20c4293c8c367ffb5a7 is the first bad commit commit 72cd05d78f1077142143f20c4293c8c367ffb5a7 Author: OleHolmNielsen Date: Fri Apr 23 15:11:37 2021 +0200 Changes related

Re: [slurm-users] How to get an estimate of job completion for planned maintenance?

2021-12-14 Thread Ryan Novosielski
Another useful format string – and again, this is if you mess up and don’t do a reservation early enough (or your environment has no concept of a time limit) – is this one: squeue -o %u,%i,%L Will show you username, job id, and remaining time – which is sometimes easier to deal with than end d
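To make the format string above concrete, here is a minimal sketch that reads `squeue -o %u,%i,%L` output and lists the jobs that would still be running at the start of a maintenance window. The sample text, the helper names `parse_time_left` and `jobs_running_past`, and the four-hour window are all made up for illustration. The sketch assumes the remaining time is printed as `MM:SS`, `HH:MM:SS`, or `D-HH:MM:SS`, and it treats anything non-numeric (for example `UNLIMITED`) as having no known end.

```python
# Hypothetical sample of `squeue -o %u,%i,%L` output (header line included).
SAMPLE = """USER,JOBID,TIME_LEFT
alice,1001,1-02:03:04
bob,1002,59:43
carol,1003,UNLIMITED
dave,1004,3:00:00
"""


def parse_time_left(field):
    """Return the remaining time in seconds, or None if it is not a finite duration."""
    days = 0
    has_days = "-" in field
    try:
        if has_days:
            day_part, field = field.split("-", 1)
            days = int(day_part)
        parts = [int(p) for p in field.split(":")]
    except ValueError:
        # Non-numeric values such as UNLIMITED or NOT_SET end up here.
        return None
    if len(parts) > 3:
        return None
    while len(parts) < 3:
        if has_days:
            parts.append(0)      # "D-HH" or "D-HH:MM": missing fields are on the right
        else:
            parts.insert(0, 0)   # "MM:SS": missing fields are on the left
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def jobs_running_past(text, window_seconds):
    """Return (user, job id, seconds left) for jobs not guaranteed to end in time."""
    late = []
    for line in text.strip().splitlines()[1:]:   # skip the header line
        user, job_id, time_left = line.split(",", 2)
        left = parse_time_left(time_left.strip())
        if left is None or left > window_seconds:
            late.append((user, job_id, left))
    return late


if __name__ == "__main__":
    # Maintenance assumed to start four hours from now.
    for user, job_id, left in jobs_running_past(SAMPLE, 4 * 3600):
        print(user, job_id, "no known end" if left is None else f"{left} s left")
```

On a real cluster the same functions could be fed the captured output of the `squeue` command instead of `SAMPLE`; job IDs are kept as strings so that array job IDs survive unchanged.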

Re: [slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU

2021-12-14 Thread Ryan Novosielski
Hi Ole, Thanks again for your great tools! Is something expected to have broken this script for older versions of Slurm somehow? A version we have with a file time of 1/19/21 will show job IDs and users for a given node, but the version you released yesterday does not seem to (we may have miss

Re: [slurm-users] [EXT] Re: slurmdbd full backup so the primary can be purged

2021-12-14 Thread Ransom, Geoffrey M.
Looking over these options it looks like Archive only happens for purged data. Can you archive data without actually purging data? I’d like to test archives to see the output first without risking loss of content. I was thinking I could have a nightly archive copy that is up to date to the day,

[slurm-users] work with sensitive data

2021-12-14 Thread Michał Kadlof
Hi, some of my users work with "sensitive data". Currently we use standard unix groups with ACLs to limit access but I wonder if there is any way to keep data encrypted (for example with gpg) and decrypt them "on the fly" in Slurm job and then encrypt the results again after the job is finish

Re: [slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU

2021-12-14 Thread Ole Holm Nielsen
Hi Loris, It would be great if Slurm could read the GPU load using the Nvidia monitoring tools, and then make the GPUload available through "scontrol show node xxx". But I don't know if anyone has asked for (and paid) SchedMD to implement this? Best regards, Ole On 12/14/21 14:16, Loris Be

Re: [slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU

2021-12-14 Thread Loris Bennett
Hi Ole, Ole Holm Nielsen writes: > The latest pestat version now adds a red color highlight if the GRES GPU is the (null) value. > We use this to highlight jobs on GPU nodes which didn't request any GPU resources, thereby possibly wasting resources. > Could you test if this is useful

Re: [slurm-users] srun fails with "srun: error: Security violation, slurm message from uid" if delay in job starting

2021-12-14 Thread Mark Dixon
Hi all, Sorry for the noise, this was down to a problem with our configless setup. Really must start running slurmd everywhere and get rid of the compute-only version of slurm.conf... Cheers, Mark On Mon, 13 Dec 2021, Mark Dixon wrote: Hi all, Just wondering if anyone e