G'day Sid,
On 7/31/24 5:02 pm, Sid Young via slurm-users wrote:
I've been waiting for nodes to become idle before upgrading them, however
some jobs take a long time. If I try to remove all the packages I assume
that kills the slurmstepd program and with it the job.
Are you looking to do a Slurm
Thanks Tim, that fits with my observations. I will be back on it on the
13th and see what effects upgrading the required RPMs has.
Sid
On Sat, 3 Aug 2024, 01:41 Cutts, Tim, wrote:
> Generally speaking as a best practice I’d perform such things with no jobs
> running, but some upgrades you can a
I am pretty sure that with vanilla Slurm this is impossible.
What might be possible (maybe) is submitting 5-core jobs and using some
pre/post scripts which, immediately before the job starts, change the
requested number of cores to "however many are currently available on the
node where it is scheduled to run".
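
To make the "however many are available" idea concrete, here is a rough Python sketch of just the decision step. It assumes the idle CPU count arrives as an allocated/idle/other/total string, which is the layout sinfo prints for its %C format field (for example via `sinfo -N -h -o "%N %C"`). The function names, the sample values and the minimum of five are illustrative only, and the sketch does not attempt to change a job's request, which is the hard part. The idle count can also change between the query and the job start.

```python
from typing import Optional

MIN_CPUS = 5  # assumption: the 5-core minimum mentioned in the thread


def parse_cpu_states(field: str) -> dict:
    """Split an 'allocated/idle/other/total' string into named integers."""
    allocated, idle, other, total = (int(part) for part in field.strip().split("/"))
    return {"allocated": allocated, "idle": idle, "other": other, "total": total}


def cpus_to_request(field: str, minimum: int = MIN_CPUS) -> Optional[int]:
    """Return the idle CPU count if it meets the minimum, otherwise None."""
    idle = parse_cpu_states(field)["idle"]
    return idle if idle >= minimum else None


if __name__ == "__main__":
    # Hypothetical sample values, not output from a real cluster.
    for sample in ("12/20/0/32", "30/2/0/32"):
        print(sample, "->", cpus_to_request(sample))
```

A node reporting 20 idle CPUs would yield a request for 20, while a node with only 2 idle would yield None, meaning the job should not start there.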
Generally speaking as a best practice I’d perform such things with no jobs
running, but some upgrades you can allow without it. Upgrading a package, even
one which is currently in use by a running job, does not necessarily kill the
job. For example, upgrading a shared library won’t kill existi
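
The shared library case can be illustrated with a small Python sketch of the underlying POSIX behaviour: a process that already has a file open keeps seeing the old contents after the file is replaced on disk. This assumes the package manager installs the new file by writing it under a temporary name and renaming it over the old one. The file name is hypothetical, and the sketch says nothing about whether a package's install scripts restart any daemons.

```python
import os
import tempfile


def read_after_replace() -> tuple:
    """Open a file, replace it on disk, then read through the old handle."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "libexample.so")  # hypothetical file name
        with open(path, "w") as f:
            f.write("old version")

        # Stands in for a running job that already has the library open.
        handle = open(path)

        # Assumed upgrade pattern: write the new file, then rename it over the old.
        new_path = path + ".new"
        with open(new_path, "w") as f:
            f.write("new version")
        os.replace(new_path, path)

        try:
            old_view = handle.read()  # still backed by the original inode
        finally:
            handle.close()

        with open(path) as f:
            new_view = f.read()  # a fresh open sees the upgraded file
    return old_view, new_view


if __name__ == "__main__":
    print(read_after_replace())
```

On a POSIX system the existing handle reads the old contents and only a fresh open sees the new file, which is why a process started before the upgrade keeps running against the version it loaded.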
My read is that Henrique wants to specify a job to require a variable number of
CPUs on one node, so that when the job is at the front of the queue, it will
run opportunistically on however many happen to be available on a single node
as long as there are at least five.
I don't personally know
You can’t have both exclusive access to a node and sharing; that makes no
sense. You see this on AWS as well – you can select either sharing a physical
machine or not. There is no “don’t share if possible, and share otherwise”.
Unless you configure SLURM to overcommit CPUs, by definition if yo
I think all of the replies point to --exclusive being your best solution
(only solution?).
You need to know exactly the maximum number of cores a particular
application or applications will use. Then you allow other applications to
use the unused cores. Otherwise, at some point when the applicatio