My concern was your config inadvertently having that line commented out
and then seeing problems. If that wasn't the case then no worries at
this point.
We run using preempt/partition_prio on our cluster and have a mix of
partitions using PreemptMode=OFF and PreemptMode=REQUEUE. So I know that
combination works. I would be surprised if PreemptMode=CANCEL did not
work as that's a valid option.
One thing we do have set, though, is the default mode. We have:
### Governs the default preemption behavior
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
So you might try setting the default to PreemptMode=CANCEL and then
setting a specific PreemptMode for each of your partitions. That's what
we do and it works for us.
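For your case that would look roughly like the following (untested
sketch, reusing the partition names from your snippet and eliding the
other options):

PreemptType=preempt/partition_prio
PreemptMode=CANCEL

PartitionName=regular ... PreemptMode=OFF PriorityTier=200
PartitionName=All ... PreemptMode=OFF PriorityTier=500
PartitionName=lowpriority ... PreemptMode=CANCEL PriorityTier=100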
-Paul Edmon-
On 1/12/2024 10:33 AM, Davide DelVento wrote:
Thanks Paul,
I don't understand what you mean by having a typo somewhere. I mean,
that configuration works just fine right now, whereas if I uncomment
that line any slurm command just aborts with the error "PreemptType
and PreemptMode values incompatible". So, assuming there is a typo, it
should be in the commented line, right? Or are you saying that having
that line makes slurm sensitive to a typo somewhere else that would
otherwise be ignored? Obviously I can't exclude that option, but it
seems unlikely to me, also because the error does say these two things
are incompatible.
It would obviously be much better if the error said EXACTLY what is
incompatible with what, but in the documentation at
https://slurm.schedmd.com/preempt.html I see many clues as to what
that could be, and hence I am asking people here who may have already
deployed preemption on their systems. Some excerpts from that URL:
*PreemptType*: Specifies the plugin used to identify which jobs can be
preempted in order to start a pending job.
* /preempt/none/: Job preemption is disabled (default).
* /preempt/partition_prio/: Job preemption is based upon partition
/PriorityTier/. Jobs in higher PriorityTier partitions may preempt
jobs from lower PriorityTier partitions. This is not compatible
with /PreemptMode=OFF/.
which somewhat makes it sound like all partitions should have
preemption set, and not only some? I obviously have some "off"
partitions. However, elsewhere in that document it says
*PreemptMode*: Mechanism used to preempt jobs or enable gang
scheduling. When the /PreemptType/ parameter is set to enable
preemption, the /PreemptMode/ in the main section of slurm.conf
selects the default mechanism used to preempt the preemptable jobs for
the cluster.
/PreemptMode/ may be specified on a per partition basis to override
this default value if /PreemptType=preempt/partition_prio/.
which kind of sounds like it should be okay (unless it means
**everything** must be different from OFF). Yet elsewhere on that same
page it says
On the other hand, if you want to use
/PreemptType=preempt/partition_prio/ to allow jobs from higher
PriorityTier partitions to Suspend jobs from lower PriorityTier
partitions, then you will need overlapping partitions, and
/PreemptMode=SUSPEND,GANG/ to use Gang scheduler to resume the
suspended job(s). In either case, time-slicing won't happen between
jobs on different partitions.
Which somewhat sounds like only suspend and gang can be used as
preemption modes, and not cancel (my preference) or requeue (perhaps
acceptable, if I jump through some hoops).
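To be concrete, the setup I was hoping would be valid is essentially my
current conf (quoted in full in my original message below) with that
PreemptType line uncommented, i.e.:

PreemptType=preempt/partition_prio
PartitionName=regular ... PreemptMode=off PriorityTier=200
PartitionName=All ... PreemptMode=off PriorityTier=500
PartitionName=lowpriority ... PreemptMode=cancel PriorityTier=100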
So to me the documentation is highly confusing about what can and
cannot be used together with what else. The examples at the bottom of
the page are nice, but they do not specify the full settings. In
particular this one https://slurm.schedmd.com/preempt.html#example2 is
close enough to mine, but it does not say what PreemptType has been
chosen (nor whether "cancel" would be allowed in that setup).
Thanks again!
On Fri, Jan 12, 2024 at 7:22 AM Paul Edmon <ped...@cfa.harvard.edu> wrote:
At least in the example you are showing you have PreemptType
commented out, which means it will use the default. PreemptMode
CANCEL should work; I don't see anything in the documentation that
indicates it wouldn't. So I suspect you have a typo somewhere in
your conf.
-Paul Edmon-
On 1/11/2024 6:01 PM, Davide DelVento wrote:
I would like to add a preemptable queue to our cluster. Actually
I already have. We simply want jobs submitted to that queue to be
preempted if there are no resources available for jobs in other
(high priority) queues. Conceptually very simple: no conditionals,
no choices, just what I wrote.
However, it does not work as desired.
This is the relevant part:
grep -i Preemp /opt/slurm/slurm.conf
#PreemptType = preempt/partition_prio
PartitionName=regular DefMemPerCPU=4580 Default=True
Nodes=node[01-12] State=UP PreemptMode=off PriorityTier=200
PartitionName=All DefMemPerCPU=4580 Nodes=node[01-36] State=UP
PreemptMode=off PriorityTier=500
PartitionName=lowpriority DefMemPerCPU=4580 Nodes=node[01-36]
State=UP PreemptMode=cancel PriorityTier=100
That PreemptType setting (now commented out) fully breaks slurm;
everything refuses to run, with errors like
$ squeue
squeue: error: PreemptType and PreemptMode values incompatible
squeue: fatal: Unable to process configuration file
If I understand the documentation at
https://slurm.schedmd.com/preempt.html correctly, that is because
preemption cannot cancel jobs based on partition priority, which
(if true) is really unfortunate. I understand that allowing
cross-partition time-slicing could be tricky, so I see why that
isn't allowed, but cancelling? Anyway, I have a few questions:
1) is that correct, and so should I avoid using either partition
priority or cancelling?
2) is there an easy way to trick slurm into requeueing and then
have those jobs cancelled instead?
3) I guess the cleanest option would be to implement QoS, but
I've never done it and we don't really need it for anything
other than this. The documentation looks complicated, but is it?
The great Ole's website is unavailable at the moment... (my rough
guess at what it would involve is sketched below)
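Pieced together from the docs (completely untested, and the
"scavenger" QOS name is just a placeholder of mine), I imagine the QoS
route would be something like:

# in slurm.conf
PreemptType=preempt/qos
PreemptMode=CANCEL

# on the accounting side
sacctmgr add qos scavenger
sacctmgr modify qos scavenger set preemptmode=cancel
sacctmgr modify qos normal set preempt=scavenger

# and then jobs in the low priority queue would run with
# --qos=scavenger (or some default QOS arrangement I haven't
# figured out yet)

but that is exactly the extra machinery I was hoping to avoid.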
Thanks!!