I'm trying to puzzle out using QOS-based preemption instead of partition-based
so we can have the juicy prize of PreemptExemptTime. But in the process, I've
encountered something that puzzles ME.
I have two partitions that, for the purposes of testing, are identical except for
the QOS attached to them. Both partitions point to a single node and both have
OverSubscribe=NO set. I'll call them the open and sla-prio partitions.
I then start two jobs that each ask for a majority of the cores on the node.
The only differences between the two sbatch submissions are the partition and
QOS they use; the QOS is how I try to tell Slurm who preempts whom and who
has priority.
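For concreteness, the two submissions look roughly like this (a sketch: the core count and the sleep payload are illustrative, not my actual scripts; assume each job asks for more than half of the node's cores):

```shell
# "Lower priority" job on the open partition with the open QOS:
sbatch --partition=open --qos=open --ntasks=10 --wrap="sleep 600"

# "Higher priority" job on the sla-prio partition with the sla QOS:
sbatch --partition=sla-prio --qos=sla --ntasks=10 --wrap="sleep 600"
```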
QOS
Name  Preempt  PreemptMode
sla   open     cluster
open           requeue
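For reference, that QOS table corresponds to sacctmgr commands along these lines (a sketch, not a verbatim transcript):

```shell
# Create the two QOS entries:
sacctmgr add qos sla
sacctmgr add qos open

# sla preempts jobs running under the open QOS; its own PreemptMode is
# left at the cluster default (shown as "cluster" in the table above):
sacctmgr modify qos sla set Preempt=open

# Jobs under the open QOS get requeued when preempted:
sacctmgr modify qos open set PreemptMode=requeue
```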
slurm.conf
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
PreemptMode=SUSPEND,GANG
PreemptType=preempt/qos
PartitionName=open Nodes=t-sc-1101 Default=YES QOS=open CpuBind=core OverSubscribe=NO
PartitionName=sla-prio Nodes=t-sc-1101 Default=NO QOS=sla CpuBind=core OverSubscribe=NO
What I'm finding is that, when I start the "lower priority" open QOS job on the
open partition, it starts running on the node, taking more than half the cores.
I then start the "higher priority" job on the sla-prio partition with the sla
QOS. I would expect:
1. The sla job would preempt the open job (cancel or requeue it) because of
the QOS settings.
2. That no matter what, the jobs would NOT share resources, as both
partitions are set to OverSubscribe=NO.
Yet when I start both jobs, I find them both running happily on the node. Since
they both asked for more than half of the cores, they are clearly sharing
resources. I have found that if I make each job ask for ALL of the cores on
the node, THEN the preemption happens.
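The state I'm seeing looks roughly like this in squeue (hand-reconstructed from memory; job IDs and the format string are illustrative):

```shell
squeue -o "%.8i %.10P %.6q %.3t %.5C"
#   JOBID  PARTITION    QOS  ST  CPUS
#     101       open   open   R    10
#     102   sla-prio    sla   R    10   <- both Running on the same node
```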
I'm sure I've wandered into some completely weird Slurm backwater with
settings that no sane admin would ever use...but I'm just trying to figure out
what combination of settings ends up with oversubscription happening when I
thought I had REALLY clearly indicated that I didn't want it.
Thanks for any help.