Re: [slurm-users] Job cancelled into the future

2022-12-20 Thread Brian Andrus
Seems like the time may have been off on the db server at the insert/update. You may want to dump the database, find which tables/records need to be updated, and try updating them. If anything went south, you could restore from the dump. Brian Andrus On 12/20/2022 11:51 AM, Reed Dier wrote: Just to

Re: [slurm-users] Job cancelled into the future

2022-12-20 Thread Reed Dier
Just to follow up with some things I’ve tried: scancel doesn’t want to touch it: > # scancel -v 290710 > scancel: Terminating job 290710 > scancel: error: Kill job error on job id 290710: Job/step already completing > or completed scontrol does see that these are all members of the same array, b

[slurm-users] Slurm version 22.05.7 is now available

2022-12-20 Thread Marshall Garey
We are pleased to announce the availability of Slurm version 22.05.7. This includes a number of minor to moderate bug fixes, including fixing an issue when upgrading to MariaDB >= 10.2.1 from an older release. Slurm can be downloaded from https://www.schedmd.com/downloads.php . - Marshall --

Re: [slurm-users] Job cancelled into the future

2022-12-20 Thread Reed Dier
2 votes for runawayjobs is a strong vote (and also something I’m glad to learn exists for the future); however, it does not appear to be the case. > # sacctmgr show runawayjobs > Runaway Jobs: No runaway jobs found on cluster $cluster So, unfortunately, that doesn’t appear to be the culprit. Appr

Re: [slurm-users] Job cancelled into the future

2022-12-20 Thread Brian Andrus
Try: sacctmgr list runawayjobs Brian Andrus On 12/20/2022 7:54 AM, Reed Dier wrote: Hoping this is a fairly simple one. This is a small internal cluster that we’ve been using for about 6 months now, and we’ve had some infrastructure instability in that time, which I think may be the roo

Re: [slurm-users] Job cancelled into the future

2022-12-20 Thread Sarlo, Jeffrey S
Do they show up as runaway jobs? sacctmgr show runawayjobs If they do, it should give you the option to fix them. Jeff From: slurm-users On Behalf Of Reed Dier Sent: Tuesday, December 20, 2022 9:54 AM To: Slurm User Community List Subject: [slurm-users] Job cancelled into the future Hoping

[slurm-users] Job cancelled into the future

2022-12-20 Thread Reed Dier
Hoping this is a fairly simple one. This is a small internal cluster that we’ve been using for about 6 months now, and we’ve had some infrastructure instability in that time, which I think may be the root culprit behind this weirdness, but hopefully someone can point me in the direction to solv

[slurm-users] SlurmCommander

2022-12-20 Thread Petar Jager
Dear slurm-users community, we, the Vienna BioCenter HPC team, would like to share with you SlurmCommander, a simple, lightweight, no-dependencies text-based user interface (TUI) to your Slurm cluster. https://github.com/CLIP-HPC/SlurmCommander Behind the scenes, it combines JSON output of m