Once more, hello Slurm-Dev,
The problem remains after upgrading to 17.02.6 today. A job submitted
to multiple partitions and pending for Resources has a single priority
that reflects the PriorityJobFactor of whichever partition comes first in
the list. Is this a bug? I spent a while digging through the bug
tracker and couldn't find anything, although changelog entries for 17.11
might be relevant. Thoughts?
Thank you!
Corey
On 08/11/2017 02:38 PM, Corey Keasling wrote:
Hello again,
Looks like I'll make more definite plans to upgrade. Per the Changelog
for 17.02.3:
-- Fix updating job priority on multiple partitions to be correct.
Corey
--
Corey Keasling
Software Manager
JILA Computing Group
University of Colorado-Boulder
440 UCB Room S244
Boulder, CO 80309-0440
303-492-9643
On 08/11/2017 01:50 PM, Corey Keasling wrote:
Hi Slurm-Dev,
I'm trying to determine how a job's multifactor priority is calculated
when the job is submitted to multiple partitions where each partition
has a different priority factor. I'm running 16.05.6 with ill-defined
plans to move to 17.02.
My cluster is partitioned such that one partition is a subset of another
with the subset having a 10x higher PriorityJobFactor. The intent is to
give greater priority on the subset to the group that purchased it while
allowing all users to run on all nodes. Thus I hope to permit the
privileged group to submit jobs to both partitions simultaneously, but
to have their greater priority apply only to the subset. However, based
on squeue and sprio, this may not be happening.
squeue -P reports identical priorities for both entries (i.e., the same
job but considered for p1 and p2). sprio seems to report the priority
as calculated for the first partition in the list (i.e., if submitted
via sbatch -p p1,p2 the job gets the p1 priority factor, while sbatch
-p p2,p1 gives the p2 priority factor).
So what's actually going on under the hood? Does the scheduler
calculate priorities for each (job,partition) pair separately, or only
once?
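To make the two possibilities concrete, here is a minimal Python sketch
of both behaviors. It is not Slurm's code: the function names, the
weight, and the factor values are all hypothetical, and the partition
term is assumed to be a weight multiplied by the partition's
PriorityJobFactor normalized against the largest factor, with every
other priority factor lumped into one constant.

```python
# Illustrative model only; every name and number here is an assumption,
# not taken from Slurm source or from any real slurm.conf.
PARTITION_JOB_FACTOR = {"p1": 10, "p2": 1}  # p1 is the 10x subset
WEIGHT_PARTITION = 1000                     # stand-in for a partition weight
OTHER_FACTORS = 500                         # stand-in for age, fairshare, etc.


def partition_term(name):
    """Partition contribution: weight times the normalized job factor."""
    top = max(PARTITION_JOB_FACTOR.values())
    return WEIGHT_PARTITION * PARTITION_JOB_FACTOR[name] / top


def per_partition_priorities(partitions):
    """Behavior hoped for: one priority per (job, partition) pair."""
    return {p: OTHER_FACTORS + partition_term(p) for p in partitions}


def first_partition_priority(partitions):
    """Behavior observed: the first listed partition sets one priority
    that is then reported for every partition in the list."""
    value = OTHER_FACTORS + partition_term(partitions[0])
    return {p: value for p in partitions}


if __name__ == "__main__":
    print(per_partition_priorities(["p1", "p2"]))
    print(first_partition_priority(["p1", "p2"]))
    print(first_partition_priority(["p2", "p1"]))
```

In this model the first function gives the job a different priority in
each partition, while the second reproduces what squeue and sprio appear
to show: the result changes when the order of the partition list changes.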
Thank you for your help!