Is there a reason to run them as a single job?

It may be easier to just have 2 separate jobs of 16 cores each.

If one program depends on the other, you can handle that by adding a dependency (sbatch's --dependency option) to the job submission.
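
As a rough sketch of the two-job approach (not tested on the cluster in question), the submissions might look like the script below. The submit helper and the DRY_RUN variable are made up for illustration so the script can be tried without a scheduler; program1 and program2 are the names from the original question, and JOBID in the comment is a placeholder.

```shell
#!/bin/bash
# Sketch: submit program1 and program2 as two separate 16-CPU jobs.
# DRY_RUN=1 (the default) only prints each sbatch command; set DRY_RUN=0
# on a real cluster to actually submit.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the sbatch command in dry-run mode, run it otherwise.
submit() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "sbatch $*"
    else
        sbatch "$@"
    fi
}

# Each job asks for one node, one task and all 16 CPUs of that node.
submit --nodes=1 --ntasks=1 --cpus-per-task=16 --wrap="program1"
submit --nodes=1 --ntasks=1 --cpus-per-task=16 --wrap="program2"

# If program2 had to wait for program1, the second submission could instead
# carry a dependency on the first job's ID (JOBID is a placeholder):
#   sbatch --dependency=afterok:JOBID --nodes=1 --ntasks=1 --cpus-per-task=16 --wrap="program2"
```

With two independent jobs the scheduler can start each one as soon as a node is free, rather than waiting until two nodes are free at the same time.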

Brian Andrus

On 7/25/2020 2:50 AM, Даниил Вахрамеев wrote:
Hi everyone!

I have a SLURM cluster with several nodes, each with 16 vCPUs. I tried to run the following batch script:

    #SBATCH --nodes 2
    #SBATCH --ntasks 2
    #SBATCH -c 16
    srun --exclusive --nodes=1 program1 &
    srun --exclusive --nodes=1 program2 &
    wait

|program1| and |program2| each need 16 CPUs. I expected 2 nodes with 32 cores to be allocated, with |program1| running on the first node and |program2| on the second, but I got the following error message:

|srun: error: Unable to create step for job 364966: Requested node configuration is not available|

If I use only the |--nodes| and |--ntasks| options, sbatch allocates 2 nodes with 2 CPUs, and if I use the |--nodes| and |-c| options, I get a message that |--ntasks| must be defined.

If I set |--ntasks=1|, SLURM sets nnodes to 1.

How can I run these two programs in one batch job, each on its own node with 16 vCPUs?
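
For reference, here is a hedged sketch of one single-job layout that is commonly suggested for this situation. It is not confirmed anywhere in this thread. A likely cause of the error is that each srun step inherits the job-wide task count of 2, so a step limited to one node would need 2 x 16 = 32 CPUs there. Giving each step its own --ntasks=1 may avoid that. The file name two_programs.sbatch is made up for illustration, and the script below only writes the batch file; it does not submit it.

```shell
#!/bin/bash
# Sketch: write a single-job batch script to a file without submitting it.
# The file name two_programs.sbatch is a placeholder.
cat > two_programs.sbatch <<'EOF'
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16

# Each step is limited to one node, one task and 16 CPUs, so the two
# steps can run side by side on the two allocated nodes.
srun --nodes=1 --ntasks=1 --cpus-per-task=16 --exclusive program1 &
srun --nodes=1 --ntasks=1 --cpus-per-task=16 --exclusive program2 &

# Keep the batch script alive until both background steps have finished.
wait
EOF

echo "wrote two_programs.sbatch"
```

On a real cluster the generated file would then be submitted with sbatch two_programs.sbatch. Whether the steps land on separate nodes still depends on the site's Slurm version and configuration.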

------

Kind regards,

Daniil Vakhrameev

