I have been trying to get preemption to work for some time now. The goal is to
have a specific partition for "cheap" jobs. Users are allowed to use this
partition freely, but as soon as a higher-priority job enters the queue, these
"cheap" jobs should be canceled. However, we would like to use the GraceTime to
shut the jobs down gracefully, and this is something I fail to achieve.
CentOS-7.3, slurm-17.02.7.el7, OpenMPI-1.10.7.
Have noticed the following:
1. Preemption in general works for me; it is the "signaling" part that bugs me.
2. When running an MPI job, it gets killed as soon as Slurm detects a
higher-priority job (no matter if submitted via sbatch/mpirun or srun).
3. For a "sleep script" like the following:
###################################
#!/bin/bash -l
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --error=err.txt
#SBATCH -p cheap
#SBATCH -n 64
#SBATCH -t 12:00:00
sig_cont()
{
    echo "function sig_cont called. Exiting"
    echo 'sig_cont' > slask_cont
}
sig_term()
{
    echo "function sig_term called. Exiting"
    echo 'sig_term' > slask_term
}
sig_kill()
{
    echo "function sig_kill called. Exiting"
    echo 'sig_kill' > slask_kill
}
trap 'sig_cont' SIGCONT
trap 'sig_term' SIGTERM
# Note: SIGKILL cannot be caught, so this handler never runs.
trap 'sig_kill' SIGKILL
sleep 400
##########################################
It seems as if none of the signals are detected until AFTER the grace time is
over, and then it is SIGCONT and SIGTERM that are being detected, i.e., nothing
seems to be detected at first notice of the priority clash!
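One thing that may matter here: bash only runs a trap handler after the
current foreground command has returned, so with a plain "sleep 400" the
handlers cannot run until sleep itself exits. Whether this fully explains the
behaviour under Slurm is an assumption on my part. Below is a minimal sketch of
the usual workaround (run the payload in the background and block in "wait"),
written so that it can be tried locally without Slurm. The names on_term,
payload_pid and self_pid, the 30-second sleep, and the timer that sends SIGTERM
after one second are all made up for illustration; in a real job the signal
would come from Slurm.

```shell
#!/bin/bash
# Sketch only: on_term, payload_pid, self_pid and the self-signal timer are
# illustrative stand-ins, not part of the original job script.
SECONDS=0          # bash builtin counter, used to time the handler
got_signal=""
status=0
self_pid=$BASHPID  # PID of the shell that installs the trap

on_term() {
    got_signal="SIGTERM"
    echo "SIGTERM caught after ${SECONDS}s, stopping payload"
    # Forward the signal so the payload stops as well.
    kill -TERM "$payload_pid" 2>/dev/null || true
}
trap on_term SIGTERM

sleep 30 &         # stand-in for the real payload
payload_pid=$!

# Local stand-in for the preemption signal: SIGTERM to this shell after 1s.
( sleep 1; kill -TERM "$self_pid" ) &

# "wait" is interrupted by a trapped signal and returns a status above 128,
# after which the handler runs; a foreground "sleep 30" would block instead.
wait "$payload_pid" || status=$?
elapsed=$SECONDS

# Reap the payload that the handler terminated.
wait "$payload_pid" 2>/dev/null || true
echo "wait returned status $status after ${elapsed}s"
```

Run locally, the handler should fire roughly one second in rather than after
the full sleep. In the batch script above, the equivalent change would be to
replace "sleep 400" with "sleep 400 &" followed by "wait".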
The nodes and partitions in slurm.conf:
NodeName=f00[1-5] NodeAddr=10.1.0.[1-5] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN
PartitionName=cheap Nodes=ALL Priority=1 PreemptMode=CANCEL GraceTime=20 Default=YES MaxTime=INFINITE State=UP:
PartitionName=paid_jobs Nodes=ALL Priority=1000 PreemptMode=OFF Default=YES MaxTime=INFINITE State=UP:
and
PreemptType=preempt/partition_prio
PreemptMode=CANCEL
Thanks!
/jon