New to Slurm, so this might be a stupid question. We are using slurm-17.02.7-1.el7 on a small CentOS 7.3 based cluster.
On it we are running an in-house MPI application (OpenMPI 1.10.7, built against the Intel compilers), and it is all running nicely; there is no problem submitting the jobs. There are two partitions:

    PartitionName=cheap Nodes=ALL Priority=1 PreemptMode=CANCEL GraceTime=20 Default=YES MaxTime=INFINITE State=UP
    PartitionName=paid_jobs Nodes=ALL Priority=1000 PreemptMode=OFF Default=YES MaxTime=INFINITE State=UP

The goal is to have the "cheap" jobs be cancelled by the paid ones, and this does indeed seem to be working. However, when the MPI application is running (started with mpirun) it fails. The reason seems to be that the application is killed by the first signal (SIGTERM, I think) that Slurm sends when it detects the higher-priority job. Since we would like the job to make use of the grace time before it is cancelled, this is not optimal.

In an attempt to work around this I tried starting the application with srun instead of mpirun. But then our application fails in a test where "size" from the call

    call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)

is compared to the number of partitions used in the calculation. When started with srun this variable always seems to be 1?? (Rough sketches of the job script and of this check are at the end of this mail.)

I realize that I have probably missed something basic here, and enlightenment would be greatly appreciated! ;-)

Thanks!
/jon
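
P.S. For reference, the batch script looks roughly like this; the application name and the node/task counts below are made up, the only point is the launch line:

    #!/bin/bash
    #SBATCH --partition=cheap
    #SBATCH --nodes=2              # node/task counts are placeholders
    #SBATCH --ntasks-per-node=16

    # How we normally launch; this is the run that gets killed on the
    # first signal when a paid_jobs job preempts it, with no grace time.
    mpirun ./our_mpi_app

    # What I tried instead; with this, MPI_COMM_SIZE reports size = 1.
    #srun ./our_mpi_app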
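
And the check that trips up under srun boils down to something like this (a minimal stand-in, not our actual code):

    program size_check
      use mpi
      implicit none
      integer :: size, rank, ierror

      call MPI_INIT(ierror)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)

      ! Our application compares "size" against the number of partitions
      ! used in the calculation and aborts on a mismatch; here it is just printed.
      if (rank == 0) print *, 'MPI_COMM_WORLD size = ', size

      call MPI_FINALIZE(ierror)
    end program size_check

Built with the OpenMPI 1.10.7 wrapper (mpif90) as in our real setup, this prints the expected number of tasks under mpirun, but prints 1 when launched with srun.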