You probably want the Prolog option:
https://slurm.schedmd.com/slurm.conf.html#OPT_Prolog along with the
ForceRequeueOnFail flag of PrologFlags:
https://slurm.schedmd.com/slurm.conf.html#OPT_ForceRequeueOnFail
-Paul Edmon-
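A minimal sketch of what such a Prolog script might look like, assuming the check is the same "does a known file exist" test as in the question. The script path, the SENTINEL_FILE variable and the node_ready function are hypothetical names chosen for illustration, and the slurm.conf lines in the comments are an assumed configuration, not a tested one. Note that, per the slurm.conf documentation, a failing Prolog also sets the node to DRAIN and requeues the job in a held state unless nohold_on_prolog_fail is set in SchedulerParameters, so this is a node-wide health check rather than a per-job one.

```shell
#!/bin/bash
# Hypothetical Prolog script, e.g. saved as /etc/slurm/prolog.sh
# (the path is an assumption). Assumed slurm.conf lines to enable it:
#   Prolog=/etc/slurm/prolog.sh
#   PrologFlags=ForceRequeueOnFail

# File expected to exist when the automounted filesystem is healthy.
# The default is the path from the question; SENTINEL_FILE is a
# hypothetical override that is handy when testing the script by hand.
SENTINEL_FILE="${SENTINEL_FILE:-/nfs/someplace/file_I_know_exists}"

# Return 0 if the given file exists, non-zero otherwise.
node_ready() {
    local sentinel="$1"
    if [ -f "$sentinel" ]; then
        return 0
    fi
    echo "prolog: sentinel file '$sentinel' not found" >&2
    return 1
}

# Last command in the script, so its status becomes the Prolog exit code.
node_ready "$SENTINEL_FILE"
```

Running the script by hand on a compute node and inspecting `$?` is a quick way to confirm the check behaves as expected before pointing slurm.conf at it.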
On 2/14/2024 8:38 AM, Cutts, Tim via slurm-users wrote:
Hi, I apologise if I’ve failed to find this in the documentation (and
am happy to be told to RTFM) but a recent issue for one of my users
resulted in a question I couldn’t answer.
LSF has a feature called a Pre-Exec where a script executes to check
whether a node is ready to run a task. So, you can run arbitrary
checks and go back to the queue if they fail.
For example, if I have some automounted filesystems, and I want to be
able to check for failure of the automounter, in an LSF world I can do:
bsub -E "test -f /nfs/someplace/file_I_know_exists" my_job.sh
What’s the equivalent in SLURM?
Thanks,
Tim
--
*Tim Cutts*
Scientific Computing Platform Lead
AstraZeneca
--
slurm-users mailing list -- slurm-users@lists.schedmd.com