Alexander Grund wrote:
> Although it may be better to not drain it, I'm a bit nervous with "exit
> 0" as it is very important that the job does not start/continue, i.e.
> the user code (sbatch script/srun) is never executed in that case.
> So I want to be sure that an `scancel` on the job in its
On 19.06.23 at 17:32, Gerhard Strangar wrote:
> Try to exit with 0, because it's not your prolog that failed.
That seemingly works.
I do see value in exiting with 1 to drain the node, so that we can
investigate what exactly has failed and why.
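Although it may be better to not drain it, I'm a bit nervous with "exit
0" as it is very important that the job does not start/continue, i.e.
the user code (sbatch script/srun) is never executed in that case.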
Alexander Grund wrote:
> Our first approach with `scancel $SLURM_JOB_ID; exit 1` doesn't seem to
> work as the (sbatch) job still gets re-queued.
Try to exit with 0, because it's not your prolog that failed.
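
A minimal sketch of that suggestion, in case it helps. It only shows the control flow (cancel the job, then return 0 so the failure is not attributed to the node). `check_job`, `prolog_main` and the `JOB_CHECK_OK` variable are made-up placeholders, not anything Slurm provides; only `scancel` and `SLURM_JOB_ID` come from the thread. Whether this guarantees that the user code never runs is exactly the open question above, so treat it as untested on a real cluster.

```shell
#!/bin/bash
# Sketch of a prolog that cancels the job when a site-specific check fails.

# Hypothetical check: stands in for whatever validation the site performs.
# Here it only succeeds if the made-up variable JOB_CHECK_OK is set to 1.
check_job() {
    [ "${JOB_CHECK_OK:-0}" = "1" ]
}

# Runs the check and returns the exit status the prolog should use.
prolog_main() {
    if ! check_job; then
        # The job failed the check, not the node: cancel the job ...
        scancel "$SLURM_JOB_ID"
        echo "job $SLURM_JOB_ID cancelled by prolog check" >&2
    fi
    # ... and always report success, so the prolog itself is not
    # treated as failed (exit 1 is what drains the node).
    return 0
}

# A real prolog would end with:
#   prolog_main; exit $?
```

Swapping the final `return 0` for `return 1` in the failure branch would give the "drain the node to investigate" behaviour instead, at the cost of the re-queue described in the original post.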
Hi,
We are doing some checks on the user's job inside the prolog script, and
if those checks fail the job should be canceled.
Our first approach with `scancel $SLURM_JOB_ID; exit 1` doesn't seem to
work as the (sbatch) job still gets re-queued.
Is this possible at all (i.e. prevent