Generally speaking, as a best practice I’d do upgrades like this with no jobs 
running, but some upgrades you can allow while jobs are running.  Upgrading a 
package, even one which is currently in use by a running job, does not 
necessarily kill the job.  For example, upgrading a shared library won’t kill 
existing tasks: they already have the old library version open, and the old 
file stays on disk until they release it, so they will continue to use it.  
New processes will pick up the replacement version.  Obviously that carries 
some risk, depending on what the job is, especially if the behaviour has 
changed and this isn’t just a bug-fix release.
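
To illustrate the open-file behaviour described above, here is a minimal sketch in Python, assuming a Linux/POSIX filesystem. The file name libexample.so is hypothetical, and the write-then-rename step stands in for what a package manager typically does; it is not taken from any particular packaging tool.

```python
import os
import tempfile


def demo_replace_while_open():
    """Replace a file while it is held open and report what each side sees."""
    with tempfile.TemporaryDirectory() as d:
        lib = os.path.join(d, "libexample.so")  # hypothetical library name
        with open(lib, "w") as f:
            f.write("version 1")

        # The "running job": it opened the old file before the upgrade.
        running = open(lib)

        # The "upgrade": write the new file alongside, then rename it into
        # place. The old file is unlinked but stays alive while it is open.
        new = lib + ".new"
        with open(new, "w") as f:
            f.write("version 2")
        os.replace(new, lib)

        seen_by_running = running.read()
        running.close()

        # A "new process": it opens the path after the upgrade.
        with open(lib) as f:
            seen_by_new = f.read()

        return seen_by_running, seen_by_new


if __name__ == "__main__":
    print(demo_replace_while_open())
```

The already-open handle keeps reading the old contents, while anything opening the path afterwards gets the new ones. This is the same reason a long-running job and a newly started one can end up on different library versions.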

I’ve certainly applied security patches to live systems in the past; for 
example, upgrading openssh.  You need to take a risk-based approach to it.  The 
lowest-risk approach is to submit an exclusive job as root to drain the node, 
run the update and then reboot it.  But you might be waiting a long time, which 
is unacceptable for high-severity security patches.  The higher-risk approach 
is to use some other mechanism to run the update anyway: ansible, dsh, whatever 
your process is.
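
As a rough sketch of the lowest-risk route above, the snippet below only builds the sbatch command line for an exclusive job pinned to one node; it does not submit anything. The node name node01, the job name, and the update command are all hypothetical placeholders, so substitute whatever your site actually uses.

```python
import shlex


def exclusive_update_job(node, update_cmd):
    """Build (but do not run) an sbatch command for an exclusive update job."""
    return [
        "sbatch",
        "--exclusive",               # the job gets the whole node to itself
        f"--nodelist={node}",        # pin the job to the node being upgraded
        "--job-name=node-update",    # hypothetical job name
        f"--wrap={update_cmd}",      # shell command the job will run
    ]


if __name__ == "__main__":
    # Hypothetical node and update command; adjust for your distribution.
    cmd = exclusive_update_job("node01", "dnf -y update && reboot")
    print(shlex.join(cmd))
```

Because the job is exclusive, it only starts once everything else on the node has finished, which is where the potentially long wait comes from. It would need to be submitted as root for the update and reboot to work.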

Can you cope with the users turning up at your desk with flaming torches and 
pitchforks if it goes wrong?  😊

Regards,

Tim
--
Tim Cutts
Scientific Computing Platform Lead
AstraZeneca

From: Sid Young via slurm-users <slurm-users@lists.schedmd.com>
Date: Thursday, 1 August 2024 at 1:04 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Upgrade node while jobs running
G'day all,

I've been waiting for nodes to become idle before upgrading them; however, some 
jobs take a long time. If I try to remove all the packages, I assume that kills 
the slurmstepd program and with it the job.

Sid