Hi Dave,
I can confirm that CoreSpecCount cannot be reset to 0 once it has been set >0
(at least with FastSchedule>0). As a workaround for this bug you can try
stopping slurmctld, removing the node_state file, and starting slurmctld again.
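A minimal sketch of that workaround, assuming a systemd-managed slurmctld and a StateSaveLocation of /var/spool/slurmctld (both are site-specific assumptions; check your own configuration first):

```shell
# Workaround sketch: clear the saved node state so CoreSpecCount can be reset.
# The path and service name are assumptions -- verify the directory with:
#   scontrol show config | grep StateSaveLocation
systemctl stop slurmctld
rm /var/spool/slurmctld/node_state    # the node_state file mentioned above
systemctl start slurmctld
```

Note that removing state files discards the saved node state, so it is worth taking a backup of the directory before trying this.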
Best regards,
Taras
On Fri, Aug 9, 2019 at 11:54 PM Guertin, David S. wrote:
Hi Paul,
I submitted the poll - thanks! For bug #7609, while I'd be happier with a built-in
Slurm solution, you may find that our jobscript archiver implementation
would work nicely for you. It is very high-performing and has no effect on
scheduler or DB performance.
The solution is a mu
The state save location is where Slurm stores its current information
about jobs. That location holds the live data of the cluster and is what
allows it to survive restarts of the slurmctld. The slurmdbd holds nearly
live information and is not used by the slurmctld for current job
state. Thus if
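As a quick way to see what actually lives in that location, one can query the running controller for its configured path and list the state files there (the directory shown is only an example; query your own config first):

```shell
# Print the configured state save directory (scontrol ships with Slurm).
scontrol show config | grep StateSaveLocation
# Inspect the live state files slurmctld reads back on restart;
# /var/spool/slurmctld is an assumed example path, not a universal default.
ls -l /var/spool/slurmctld
```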
Hello,
I apologise that this email is a bit vague; however, we are keen to understand
the role of the Slurm "StateSave" location. I can see the value of the
information in this location when, for example, we are upgrading Slurm and the
database is temporarily down; however, as I note above, we are
Just curious: if you leave out the singleton, do you get the behavior
you expected?
On Tue, Aug 27, 2019 at 9:42 AM Jarno van der Kolk wrote:
>
> Hi all,
>
> I'm still puzzled by the expected behaviour of the following:
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909273
> $ sbatch --hold
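A hedged sketch of the submission pattern under discussion, assuming a trivial fakejob.sh and that the jobs share a job name for the singleton dependency (both details are assumptions, since the quoted command is truncated):

```shell
# Submit two held jobs with the same name and a singleton dependency.
sbatch --hold --job-name=fake --dependency=singleton fakejob.sh
sbatch --hold --job-name=fake --dependency=singleton fakejob.sh
# Held jobs sit pending with reason JobHeldUser; inspect, then release one.
squeue --me --states=PD --format="%i %j %t %r"
scontrol release <jobid>    # placeholder job id from the squeue output
```

With singleton, the second job should stay pending until the first job of the same name completes, even after both holds are released.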
We have several pending feature requests to SchedMD regarding different
features we would like to see, as I am sure many other groups have. We
were curious if anyone else in the community is interested in these
features and if your group would be interested in talking with us
(Harvard FAS Rese