[slurm-users] SlurmDBD errors

2024-09-18 Thread Sajesh Singh via slurm-users
OS: CentOS 8.5
Slurm: 22.05

Recently upgraded to 22.05. The upgrade was successful, but after a while I started to see the following messages in the slurmdbd.log file:

error: We have more time than is possible (9344745+7524000+0)(16868745) > 12362400 for cluster CLUSTERNAME(3434) from 2024-09-18T13
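For readers trying to decode the numbers in that message, here is a small arithmetic sketch in Python. It only uses the figures quoted in the log line. The reading of the three summed terms as allocated, down, and planned-down time, and of 3434 as the cluster's TRES count (for example CPUs), is an assumption and is not confirmed anywhere in this thread. The one-hour window is inferred from 12362400 / 3434 = 3600.

```python
# Figures copied verbatim from the slurmdbd.log line quoted above.
terms = (9344745, 7524000, 0)   # ASSUMED meaning: allocated, down, planned-down seconds
reported_total = 16868745       # the value in the second pair of parentheses
capacity = 12362400             # the value on the right-hand side of ">"
tres_count = 3434               # the number after CLUSTERNAME; ASSUMED to be a TRES count

# The reported total is simply the sum of the three terms.
used = sum(terms)

# Capacity divided by the count gives the length of the accounting window in seconds.
period_seconds = capacity // tres_count

# How far the accounted time exceeds what the window could hold.
excess = used - capacity
ratio = used / capacity

print(f"sum of terms      : {used}")
print(f"window (seconds)  : {period_seconds}")
print(f"excess TRES-secs  : {excess}")
print(f"used / capacity   : {ratio:.3f}")
```

Under those assumptions, the message says that more TRES-seconds were accounted for in the window than the cluster could physically provide, which is consistent with the suggestion elsewhere in the thread to look at oversubscription or runaway jobs.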

[slurm-users] Re: SlurmDBD errors

2024-09-18 Thread Ryan Novosielski via slurm-users
I don’t think you should expect this from overlapping nodes in partitions, but rather when you’re allowing the hardware itself to be oversubscribed. Was your upgrade in this window? I would suggest looking for runaway jobs, which you’ve already done; beyond that, I’m not sure what else to check.

[slurm-users] Re: SlurmDBD errors

2024-09-18 Thread Sajesh Singh via slurm-users
The upgrade was a couple of hours prior to the messages appearing in the logs. SS

[slurm-users] Change a job from --exclusive to --exclusive=user

2024-09-18 Thread Gerhard Strangar via slurm-users
Hello, is it possible to change a pending job from --exclusive to --exclusive=user? I tried scontrol update jobid=... oversubscribe=user, but it seems to only accept yes or no. Gerhard