Hi Nícolas! In Slurm lingo this is "job requeueing". The JobRequeue parameter in slurm.conf controls whether Slurm tries to start those jobs again (requeue) or lets them exit.
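For example, a minimal sketch (the script name and job name below are hypothetical): to make jobs exit instead of restarting after a reboot or node failure, you can change the cluster-wide default in slurm.conf:

    # slurm.conf: batch jobs are NOT requeued after node failure
    # unless the user explicitly asks for it with --requeue
    JobRequeue=0

or disable requeueing per job, in the batch script:

    #!/bin/bash
    #SBATCH --job-name=myjob      # hypothetical job name
    #SBATCH --no-requeue          # let this job end instead of restarting it
    srun ./my_program

or on the command line at submission time:

    sbatch --no-requeue job.sh

With requeueing disabled, a job interrupted by a node failure simply ends rather than being started again from the beginning.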
The slurm.conf doc puts it nicely:

"This option controls the default ability for batch jobs to be requeued. Jobs may be requeued explicitly by a system administrator, after node failure, or upon preemption by a higher priority job. If JobRequeue is set to a value of 1, then batch jobs may be requeued unless explicitly disabled by the user. If JobRequeue is set to a value of 0, then batch jobs will not be requeued unless explicitly enabled by the user. Use the sbatch --no-requeue or --requeue option to change the default behavior for individual jobs. The default value is 1."

--
Paul Brunk, system administrator
Advanced Computing Resource Center
Enterprise IT Svcs, the University of Georgia

On 8/18/22, 1:57 PM, "slurm-users" <slurm-users-boun...@lists.schedmd.com> wrote:

> Hi!
>
> This week my machines rebooted, the jobs that were running restarted, and I lost the progress they had made. Can I prevent jobs from restarting like this? For example, if my machines reboot, the jobs would be cancelled instead.
>
> Thank you.
> Nícolas