* Tina Friedrich <tina.friedr...@it.ox.ac.uk> [210521 16:35]:

> If this is simply about quickly accessing nodes that they have jobs on to
> check on them - we tell our users to 'srun' into a job allocation (srun
> --jobid=XXXXXX).

Hi Tina,

Sadly, this no longer always works in version 20.11.x because of the
new non-overlapping default behaviour for job step allocations.

$ sbatch -n 1 --wrap="srun sleep 600"
Submitted batch job 2550804
$ squeue --me
     JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    2550804  standard     wrap user01   R       0:06      1 n0326

$ srun --jobid=2550804 --pty /bin/bash
srun: Job 2550804 step creation temporarily disabled, retrying (Requested nodes are busy)

(and hangs forever until Ctrl-C'ed ...)

^Csrun: Cancelled pending job step with signal 2
srun: error: Unable to create step for job 2550804: Job/step already completing or completed
$

To always work as before, this now needs the --overlap option in both
places: on the job step started inside the job allocation itself and on
the srun command that attaches the shell.
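
A minimal sketch of how the two --overlap invocations could look, reusing
the test job from above. The helper name slurm_attach and the variable
SLURM_ATTACH_DRYRUN are made up for illustration and are not part of Slurm;
only srun's --jobid, --overlap and --pty options are assumed to exist, as
in 20.11.

```shell
# Inside the batch job, the step itself would be started with --overlap,
# for example:
#   sbatch -n 1 --wrap="srun --overlap sleep 600"

# Hypothetical helper: attach an interactive shell to a running job.
# With SLURM_ATTACH_DRYRUN=1 it only prints the command it would run.
slurm_attach() {
    cmd="srun --jobid=$1 --overlap --pty /bin/bash"
    if [ "${SLURM_ATTACH_DRYRUN:-0}" = 1 ]; then
        printf '%s\n' "$cmd"
    else
        # Relies on word splitting to turn the string back into arguments.
        $cmd
    fi
}
```

Usage would be "slurm_attach 2550804". If the step inside the job was
started without --overlap, the attaching srun can presumably still block
as shown in the transcript above.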

Best regards
Jürgen


