After using just Fairshare for over a year on our GPU cluster, we
have decided it is not achieving what we really want among our
groups, so we are now looking at preemption.
What we want is for users to have NO per-user job/GPU maximum (if
someone is the only person on the cluster, they should be able to use
all of it), but if another user submits to the "full" cluster, they
should immediately be able to run some jobs. Thus preemption is needed.
In our scheme we want
* users to have N protected GPU jobs that cannot be preempted
where N is the number of GPUs allocated.
* N may not be the same for all users; some privileged users get more.
* jobs pending in the queue will have lower priority depending on the
number of GPUs already allocated to that user's running jobs.
Maybe doable somehow with PriorityWeightJobSize, though I am not sure how.
* Jobs over N are subject to preemption (and are requeued if --requeue
was given), with the shortest-running jobs of the user holding the most
unprotected GPUs preempted first.
* another complication is that we have a variety of different GPU
types, and users may ask for specific ones, which can limit which
unprotected GPU jobs are available for preemption.
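To make the preemption-order rule above concrete, here is a hypothetical
sketch in plain Python (not anything Slurm provides) of the victim
selection we want: protect each user's longest-running jobs up to their
quota N, then preempt the shortest-running unprotected job of the user
holding the most unprotected GPUs. The Job fields and the quota table are
stand-ins for whatever squeue/sacctmgr would actually report:

```python
from dataclasses import dataclass

@dataclass
class Job:
    user: str
    gpus: int
    runtime_s: int  # seconds the job has been running
    job_id: int

def pick_victim(jobs, quota):
    """Return the job to preempt next, or None if every job is protected.

    quota maps user -> N, that user's number of protected GPUs.
    A user's longest-running jobs are protected first, until N is used up.
    """
    by_user = {}
    for job in jobs:
        by_user.setdefault(job.user, []).append(job)

    user_excess = {}  # user -> (unprotected GPU count, unprotected jobs)
    for user, user_jobs in by_user.items():
        # protect the longest-running jobs first, up to N GPUs
        user_jobs.sort(key=lambda j: -j.runtime_s)
        budget = quota.get(user, 0)
        excess = []
        for j in user_jobs:
            if budget >= j.gpus:
                budget -= j.gpus  # fully protected
            else:
                excess.append(j)  # over quota: preemptable
        if excess:
            user_excess[user] = (sum(j.gpus for j in excess), excess)

    if not user_excess:
        return None
    # user currently holding the most unprotected GPUs...
    worst = max(user_excess, key=lambda u: user_excess[u][0])
    # ...loses their shortest-running unprotected job
    return min(user_excess[worst][1], key=lambda j: j.runtime_s)
```

This does not yet account for GPU types; a real version would filter the
candidates down to jobs whose GPUs satisfy the incoming request.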
My first attempt at this in Slurm was simply to create two partitions,
GPU and GPU-req, with different PriorityTier values, the latter
partition having PreemptMode=REQUEUE. But N would be set by a MaxTRES
limit on the first partition and so be the same for everyone, and we
need it to be INDEPENDENT for each user.
Also, users would have to "think" about which partition to submit jobs to.
And users want their longest-running "unprotected" job to be PROMOTED
automatically to a "protected" job when a "protected" job finishes.
However, Slurm does not allow running jobs to move between partitions.
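For reference, that two-partition attempt corresponds roughly to the
following configuration (node names and the MaxTRES value are
illustrative, not our real settings):

```
# slurm.conf (sketch)
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
PartitionName=GPU     Nodes=gpu[01-08] PriorityTier=2 QOS=gpu_protected
PartitionName=GPU-req Nodes=gpu[01-08] PriorityTier=1 PreemptMode=REQUEUE

# QOS attached to the high-tier partition, capping everyone at the SAME N:
#   sacctmgr add qos gpu_protected
#   sacctmgr modify qos gpu_protected set MaxTRESPerUser=gres/gpu=4
```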
I am trying to figure out QOS preemption, which might solve the
independent-N-per-user issue, but I don't think it will solve the
promotion issue.
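In case it helps the discussion, the QOS-preemption direction I am
imagining looks roughly like this (QOS names and numbers are made up).
It would give each user an independent N, but still leaves the
promotion problem unsolved:

```
# slurm.conf
PreemptType=preempt/qos

# sacctmgr (sketch): one protected QOS per quota size, plus a
# preemptable scavenge tier that the protected QOSes may preempt
sacctmgr add qos prot4
sacctmgr modify qos prot4 set MaxTRESPerUser=gres/gpu=4 Priority=100 Preempt=scavenge
sacctmgr add qos prot8
sacctmgr modify qos prot8 set MaxTRESPerUser=gres/gpu=8 Priority=100 Preempt=scavenge
sacctmgr add qos scavenge
sacctmgr modify qos scavenge set Priority=10 PreemptMode=requeue
sacctmgr modify user alice set QOS+=prot8 DefaultQOS=prot8
```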
Any ideas on how this scheme might be possible in Slurm?
Otherwise I may have to write a complicated cron job that tries to do
it all "outside" of Slurm by issuing scontrol commands.
---------------------------------------------------------------
Paul Raines http://help.nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street Charlestown, MA 02129 USA
--
slurm-users mailing list -- slurm-users@lists.schedmd.com