On 29/08/18 09:10, Priedhorsky, Reid wrote:
> This is surprising to me, as my interpretation is that the first run
> should allocate only one CPU, leaving 35 for the second srun, which
> also only needs one CPU and need not wait.
> Is this behavior expected? Am I missing something?
That's odd - and I can reproduce what you see here with Slurm 17.11.7!
However, on an older system I have access to, running 16.05.8, where I
know this technique is in use, it does work.
My test script is:
---------------8< snip snip 8<---------------
#!/bin/bash
#SBATCH -n2
#SBATCH -c2
#SBATCH --mem-per-cpu=2g
srun -n1 --mem-per-cpu=500m sleep 5 &
srun -n1 --mem-per-cpu=1g hostname
---------------8< snip snip 8<---------------
On the older system it just prints the hostname; on the newer system
I get the warning:
srun: Job 1241799 step creation temporarily disabled, retrying
Very odd...
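As a sanity check on the numbers, here is a rough back-of-the-envelope
sketch of what the test script asks for. It assumes each step inherits
-c2 from the batch job (so one task means 2 CPUs) and that Slurm reads
2g as 2048 MB and 1g as 1024 MB; the variable names are purely
illustrative and this is not how Slurm itself does the accounting.

```shell
#!/bin/bash
# Rough resource check for the test script above.
# Assumption: each step inherits -c2 from the job, so one task = 2 CPUs.
job_tasks=2
cpus_per_task=2
job_cpus=$(( job_tasks * cpus_per_task ))      # CPUs in the allocation
job_mem_mb=$(( job_cpus * 2048 ))              # --mem-per-cpu=2g

# Step 1: srun -n1 --mem-per-cpu=500m
step1_cpus=$(( 1 * cpus_per_task ))
step1_mem_mb=$(( step1_cpus * 500 ))

# Step 2: srun -n1 --mem-per-cpu=1g
step2_cpus=$(( 1 * cpus_per_task ))
step2_mem_mb=$(( step2_cpus * 1024 ))

need_cpus=$(( step1_cpus + step2_cpus ))
need_mem_mb=$(( step1_mem_mb + step2_mem_mb ))

echo "CPUs: need $need_cpus of $job_cpus"
echo "Mem:  need $need_mem_mb MB of $job_mem_mb MB"
if (( need_cpus <= job_cpus && need_mem_mb <= job_mem_mb )); then
    echo "both steps fit"
else
    echo "steps do not fit"
fi
```

Under those assumptions the two steps need 4 of 4 CPUs and 3048 MB of
8192 MB, so a simple shortage of CPUs or memory would not seem to
account for the warning.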
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC