[slurm-users] Re: Upgrade node while jobs running

2024-08-02 Thread Christopher Samuel via slurm-users
G'day Sid, On 7/31/24 5:02 pm, Sid Young via slurm-users wrote: I've been waiting for nodes to become idle before upgrading them, however some jobs take a long time. If I try to remove all the packages I assume that kills the slurmstepd program and with it the job. Are you looking to do a Slurm
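The usual pattern for "wait for the node to go idle" can be automated with `scontrol`; a minimal sketch, assuming admin access and a placeholder node name `node001` (this snippet requires a running Slurm cluster, so it is shown as an illustrative fragment only):

```shell
# Drain the node: running jobs finish, no new jobs start.
scontrol update NodeName=node001 State=DRAIN Reason="slurm upgrade"

# Watch for the node to reach the "drained" state (no jobs left).
sinfo -n node001 -h -o "%n %T"

# After upgrading the packages, return the node to service.
scontrol update NodeName=node001 State=RESUME
```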

[slurm-users] Re: Upgrade node while jobs running

2024-08-02 Thread Sid Young via slurm-users
Thanks Tim, that fits with my observations. I will be back on it on the 13th and see what effects upgrading the required RPMs has. Sid On Sat, 3 Aug 2024, 01:41 Cutts, Tim, wrote: > Generally speaking as a best practice I’d perform such things with no jobs > running, but some upgrades you can a

[slurm-users] Re: With slurm, how to allocate a whole node for a single multi-threaded process?

2024-08-02 Thread Davide DelVento via slurm-users
I am pretty sure with vanilla slurm this is impossible. What might be possible (maybe) is submitting 5-core jobs and using some pre/post scripts which, immediately before the job starts, change the requested number of cores to "however many are currently available on the node where it is scheduled to run".
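A rough sketch of that pre-script idea, with the caveats that it is untested, that `scontrol update` can only resize a *pending* job, and that the job ID `12345` is a placeholder:

```shell
# Find the largest idle-CPU count on any node ("%C" prints
# allocated/idle/other/total, so field 2 is the idle count).
free=$(sinfo -N -h -o "%C" | awk -F/ '$2 > max {max=$2} END {print max}')

# If at least 5 CPUs are free somewhere, resize the held job to
# use them all, then release it.
if [ "$free" -ge 5 ]; then
    scontrol update JobId=12345 NumCPUs="$free"
    scontrol release 12345
fi
```

This is fragile (the free count can change between the query and the release), which is presumably why it is not a vanilla-Slurm feature.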

[slurm-users] Re: Upgrade node while jobs running

2024-08-02 Thread Cutts, Tim via slurm-users
Generally speaking as a best practice I’d perform such things with no jobs running, but some upgrades you can allow without it. Upgrading a package, even one which is currently in use by a running job, does not necessarily kill the job. For example, upgrading a shared library won’t kill existi
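Tim's point about shared libraries rests on standard Unix file semantics: a package upgrade replaces the file at a path with a new inode, but processes that already have the old inode open (or mapped) keep using it. A small demo of that behaviour, using a plain file named `lib.txt` as a stand-in for a library:

```shell
# Create the "old library" and hold it open on descriptor 3,
# the way a running job holds its mapped shared libraries.
echo "old-version" > lib.txt
exec 3< lib.txt

# "Upgrade": move the old file aside and install a new one.
# The path now points at a new inode; fd 3 still points at the old one.
mv lib.txt lib.txt.bak
echo "new-version" > lib.txt

# The already-running "process" still reads the old content.
read -u 3 line
echo "$line"          # prints: old-version
```

This is why an RPM upgrade of a library usually leaves running jobs intact, whereas restarting the daemons (or killing slurmstepd) does not.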

[slurm-users] Re: With slurm, how to allocate a whole node for a single multi-threaded process?

2024-08-02 Thread Laura Hild via slurm-users
My read is that Henrique wants to specify a job to require a variable number of CPUs on one node, so that when the job is at the front of the queue, it will run opportunistically on however many happen to be available on a single node as long as there are at least five. I don't personally know
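One option that may come close to "at least five CPUs, more if available" is sbatch's per-node minimum; a hedged sketch (worth testing, since the exact interaction with `--exclusive` depends on the cluster configuration, and `job.sh` is a placeholder):

```shell
# One node, a floor of 5 CPUs on it; --exclusive then grants
# everything the node has rather than just the minimum.
sbatch --nodes=1 --mincpus=5 --exclusive job.sh
```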

[slurm-users] Re: With slurm, how to allocate a whole node for a single multi-threaded process?

2024-08-02 Thread Cutts, Tim via slurm-users
You can’t have both exclusive access to a node and sharing, that makes no sense. You see this on AWS as well – you can select either sharing a physical machine or not. There is no “don’t share if possible, and share otherwise”. Unless you configure SLURM to overcommit CPUs, by definition if yo

[slurm-users] Re: With slurm, how to allocate a whole node for a single multi-threaded process?

2024-08-02 Thread Jeffrey Layton via slurm-users
I think all of the replies point to --exclusive being your best solution (only solution?). You need to know exactly the maximum number of cores a particular application or applications will use. Then you allow other applications to use the unused cores. Otherwise, at some point when the applicatio
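Putting the `--exclusive` suggestion into a batch script, the application can size its thread pool from the environment Slurm provides rather than hard-coding a core count; a minimal sketch, where `./my_threaded_app` is a placeholder for the real program:

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --exclusive          # take the whole node, however many cores it has

# Slurm exports SLURM_CPUS_ON_NODE with the node's CPU count, so the
# job adapts to whatever hardware it landed on (fallback of 1 if unset).
export OMP_NUM_THREADS="${SLURM_CPUS_ON_NODE:-1}"

./my_threaded_app
```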