Hi:
The below might be a starting point. There could be syntax or core
concept errors :).
- Make two partitions. Both of them contain all nodes of interest:
bigger_p
smaller_p
- Make two reservations, like
scontrol create reservationname=20core_res Partition=bigger_p CoreCnt=20
scon
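[The second command is cut off above. As an assumption on my part (the names and counts simply mirror the first command, with 16 + 20 = 36 cores per node), the pair might look like:]

```
# Carve the shared 36-core nodes into two reservations, one per partition.
# A real reservation also needs e.g. starttime=now duration=infinite and a
# users= (or accounts=) list; those are omitted here for brevity.
scontrol create reservationname=20core_res Partition=bigger_p CoreCnt=20
scontrol create reservationname=16core_res Partition=smaller_p CoreCnt=16
```

[Jobs would then request the matching reservation at submit time, e.g. `sbatch --reservation=20core_res ...`.]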
Thanks for your reply, Bjorn-Helge.
This cleared things up for me. I had not understood that we need to use the
Prolog and Epilog for the TMPDIR stuff because that guarantees the directory
is created at the very beginning of the job and deleted at the very end.
Everything now works as expected; thanks so much for your help.
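[As a sketch of the creation half described above -- my own illustration, with assumed paths, not a script from the thread. Slurm runs the Prolog as root on each allocated node before the job's first task starts, so the directory is guaranteed to exist when the job begins:]

```shell
#!/bin/bash
# Hypothetical Prolog sketch: create the per-job scratch directory before
# any task of the job runs on this node.
make_job_scratch() {
  local root="$1" user="$2" jobid="$3"
  # Refuse empty components rather than create a bogus path.
  [ -n "$root" ] && [ -n "$user" ] && [ -n "$jobid" ] || return 1
  mkdir -p "${root}/${user}/${jobid}"
  # A real Prolog, running as root, would also hand the directory to the
  # job user, e.g.: chown "$user" "${root}/${user}/${jobid}"
}

# In a real Prolog the arguments would come from Slurm's environment:
#   make_job_scratch /scratch "$SLURM_JOB_USER" "$SLURM_JOB_ID"
# Demonstration outside Slurm with a throwaway tree:
demo_root=$(mktemp -d)
make_job_scratch "$demo_root" alice 12345
[ -d "$demo_root/alice/12345" ] && echo "created"
rm -rf "$demo_root"
```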
I see at https://slurm.schedmd.com/cons_res_share.html that there are some
ways to share a node between partitions, but I don't see how to assign a
set number of cores to each partition. Is this possible? If I have some
nodes with 36 cores, is there a way to put 16 of them in one partition
and the other 20 in another?
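[Not an answer given in this thread, but one slurm.conf mechanism worth checking is the MaxCPUsPerNode partition parameter, which caps how many CPUs a partition may use on any one node (it relies on a consumable-resource SelectType such as select/cons_tres). Node and partition names below are made up:]

```
# slurm.conf sketch: both partitions span the same 36-core nodes, but each
# is limited in how many CPUs it may consume per node.
NodeName=node[01-04] CPUs=36 State=UNKNOWN
PartitionName=part16 Nodes=node[01-04] MaxCPUsPerNode=16 State=UP
PartitionName=part20 Nodes=node[01-04] MaxCPUsPerNode=20 State=UP
```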
"Putnam, Harry" writes:
> /opt/slurm/task_epilog
>
> #!/bin/bash
> mytmpdir=/scratch/$SLURM_JOB_USER/$SLURM_JOB_ID
> rm -Rf "$mytmpdir"
> exit
This might not be the reason for what you observe, but I believe
deleting the scratch dir in the task epilog is not a good idea. The
task epilog is run after each task terminates, so if a job runs more
than one task on a node, the directory could be removed while other
tasks are still using it. The job Epilog, which runs once per node
after the whole job has finished, is a safer place for the cleanup.
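[To make that concrete, here is a sketch of doing the cleanup in a job Epilog instead; the paths and the guard style are my assumptions, not a script from the thread. Slurm runs the Epilog once per node, as root, after all of the job's tasks have exited:]

```shell
#!/bin/bash
# Hypothetical Epilog sketch: remove the per-job scratch directory only
# after the whole job has finished on this node.
cleanup_job_scratch() {
  local root="$1" user="$2" jobid="$3"
  # Refuse empty components so this can never expand to 'rm -rf /scratch//'.
  [ -n "$root" ] && [ -n "$user" ] && [ -n "$jobid" ] || return 1
  rm -rf "${root}/${user}/${jobid}"
}

# In a real Epilog the arguments would come from Slurm's environment:
#   cleanup_job_scratch /scratch "$SLURM_JOB_USER" "$SLURM_JOB_ID"
# Demonstration outside Slurm with a throwaway tree:
demo_root=$(mktemp -d)
mkdir -p "$demo_root/alice/12345"
cleanup_job_scratch "$demo_root" alice 12345
[ -d "$demo_root/alice/12345" ] || echo "removed"
rm -rf "$demo_root"
```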