> I know you want to suspend preempted jobs, but what happens if you
> cancel them instead?

Thanks John. Your response definitely helped me. I did as you suggested and tested CANCEL, which worked.


For John and everyone else: below are the results of my tests. My apologies for the wall of text.

My testing seems to confirm that, for SUSPEND,GANG with PreemptType set to either preempt/qos or preempt/partition_prio, there is a difference between what the man page says should happen and what actually happens.


I've verified that with 'preempt/qos', using CANCEL or REQUEUE and launching jobs on the same partition works as you say and as the man page describes.

Below are my tests:

For all tests, the following was configured:
# sacctmgr show qos format=name,priority,preempt -p
Name|Priority|Preempt|
normal|1||
expedite|2|normal|

QOS `expedite` can preempt QOS `normal`.
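The preemption relationship can also be checked mechanically from the parsable output. The following is only a sketch: `can_preempt` is a hypothetical helper name, and the sample table is copied from the output above so that the snippet runs without a cluster.

```shell
#!/bin/sh
# Sample data copied from the sacctmgr output above.
qos_table='Name|Priority|Preempt|
normal|1||
expedite|2|normal|'

# can_preempt PREEMPTOR PREEMPTEE (hypothetical helper)
# Reads "sacctmgr ... -p" style output on stdin and exits 0 if PREEMPTEE
# appears in PREEMPTOR's comma-separated Preempt column.
can_preempt() {
    awk -F'|' -v src="$1" -v dst="$2" '
        NR > 1 && $1 == src {
            n = split($3, list, ",")
            for (i = 1; i <= n; i++) if (list[i] == dst) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

if printf '%s\n' "$qos_table" | can_preempt expedite normal; then
    echo "expedite can preempt normal"
fi
```

On a live system the same check would read from the real command: sacctmgr show qos format=name,priority,preempt -p | can_preempt expedite normal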


Test 1: preempt/qos, CANCEL

slurm.conf:
  PreemptType: preempt/qos
  PreemptMode: 'CANCEL' # requeue works with this option as well.
  PreemptExemptTime: '00:00:00'

  PartitionName=DEFAULT OverSubscribe=FORCE:1 Nodes=slurm[2-4]
  PartitionName=active Default=YES QOS=normal
  PartitionName=hipri  Default=NO  QOS=expedite


QOS:
sacctmgr -i modify qos where name=normal set PreemptExemptTime=00:03:00 PreemptMode=CANCEL
sacctmgr -i modify qos where name=expedite set PreemptExemptTime=-1 PreemptMode=OFF



Result: PASS
'normal' QOS job gets canceled and 'expedite' job starts after waiting for PreemptExemptTime.


Test 2: preempt/qos, REQUEUE

slurm.conf:
  PreemptType: preempt/qos
  PreemptMode: 'CANCEL' # requeue works with this option as well.
  PreemptExemptTime: '00:00:00'

  PartitionName=DEFAULT OverSubscribe=FORCE:1 Nodes=slurm[2-4]
  PartitionName=active Default=YES QOS=normal
  PartitionName=hipri  Default=NO  QOS=expedite

QOS:
sacctmgr -i modify qos where name=normal set PreemptExemptTime=00:03:00 PreemptMode=REQUEUE
sacctmgr -i modify qos where name=expedite set PreemptExemptTime=-1 PreemptMode=OFF


Result: PASS
'normal' QOS job gets requeued and 'expedite' job starts after waiting for PreemptExemptTime.



Test 3: preempt/qos, SUSPEND,GANG

slurm.conf:
  PreemptType: preempt/qos
  PreemptMode: 'SUSPEND,GANG'
  PreemptExemptTime: '00:00:00'

  PartitionName=DEFAULT OverSubscribe=FORCE:1 Nodes=slurm[2-4]
  PartitionName=active Default=YES QOS=normal
  PartitionName=hipri  Default=NO  QOS=expedite

QOS:
sacctmgr -i modify qos where name=normal set PreemptExemptTime=00:03:00 PreemptMode=SUSPEND
sacctmgr -i modify qos where name=expedite set PreemptExemptTime=-1 PreemptMode=OFF

This page: https://slurm.schedmd.com/preempt.html
PreemptMode > SUSPEND > NOTE


"If PreemptType=preempt/qos is configured and if the preempted job(s) and the preemptor job are on the same partition, then they will share resources with the Gang scheduler (time-slicing)."

Result for same partition: PASS
Submitting to the same partition with a different QOS lets the jobs time-share the same resource.


Now getting to the function I wanted:

"If not (i.e. if the preemptees and preemptor are on different partitions) then the preempted jobs will remain suspended until the preemptor ends."

Result for submitting to different, overlapping partitions: FAIL

Submitting 'normal' QOS jobs and then one 'expedite' job from another user results in both jobs running on the same node at the same time. Nothing is suspended, requeued, or canceled. That is probably never what anyone wants.

The desired behavior, and the one the man page describes, is for the 'normal' job to be suspended, but I don't see that happening.
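For anyone who wants to try to reproduce this, the submission sequence might look roughly like the sketch below. The explicit --qos flags, the sleep 600 payload, and the helper names `run` and `reproduce_test3` are assumptions for illustration, not a record of the exact commands used in the test. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed with its arguments quoted.
# Set RUN=1 to execute on a cluster configured as in Test 3.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '+'
        for arg in "$@"; do printf " '%s'" "$arg"; done
        printf '\n'
    fi
}

reproduce_test3() {
    # Low-priority job: partition "active", QOS "normal".
    run sbatch --partition=active --qos=normal --wrap="sleep 600"
    # High-priority job: partition "hipri", QOS "expedite".
    # Assumption: submitted by a second user once the first job is running.
    run sbatch --partition=hipri --qos=expedite --wrap="sleep 600"
    # Columns: job ID, partition, QOS, state, node list.
    run squeue --format="%i %P %q %T %N"
}

reproduce_test3
```

If the documented behavior held, squeue should eventually show the 'normal' job as SUSPENDED once its PreemptExemptTime has passed, rather than both jobs RUNNING on the same node.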


Test 4: preempt/partition_prio, SUSPEND,GANG

slurm.conf:
  PreemptType: preempt/partition_prio
  PreemptMode: 'SUSPEND,GANG'
  PreemptExemptTime: '00:03:00'

  PartitionName=active OverSubscribe=FORCE:1 PriorityTier=1 PreemptMode=suspend
  PartitionName=hipri OverSubscribe=FORCE:1 PriorityTier=2 PreemptMode=off

Result: FAIL
User A's job is preempted by user B's job and gets suspended, which is desired. However, PreemptExemptTime is not respected: the job is preempted immediately.
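To spell out the expectation being tested here: a job that has run for less than PreemptExemptTime should not yet be a preemption candidate. A small sketch of that check follows; `to_seconds` and `is_exempt` are hypothetical helper names, and only the HH:MM:SS form used in this test is handled.

```shell
#!/bin/sh
# to_seconds HH:MM:SS -> total seconds (only this one format is handled).
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# is_exempt RUNTIME EXEMPT_TIME
# Exits 0 while the job's run time is still below the exempt time.
is_exempt() {
    [ "$(to_seconds "$1")" -lt "$(to_seconds "$2")" ]
}

# A job that has run for 10 seconds, with PreemptExemptTime=00:03:00.
if is_exempt 00:00:10 00:03:00; then
    echo "job should still be exempt from preemption"
fi
```

In Test 4 the job was suspended while it was still inside this window, which is why the result is marked FAIL.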


I see the following possibilities:

a. The man page does *not* accurately describe the function, or my interpretation of it is incorrect.
b. I have something misconfigured.
c. I have found a bug.

Cheers,

Phil
