Hi Roger,

Instead of canceling the job normally, the user could use
"scancel --signal=<signal_name> <jobid>" to send a specific signal to his
jobscript or application.

His application/jobscript should then be able to handle that signal and, after
performing the cleanup tasks, terminate itself gracefully.
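
As a rough illustration of the idea above, here is a minimal sketch of a batch script that traps a signal and cleans up before exiting. The job name, the scratch directory, the DEMO_SECONDS variable and the "sleep" workload are all placeholders I made up for the example, not anything from the user's real job. Note also that, depending on the Slurm version and on how the job was launched, scancel's --batch option may be needed for the signal to reach the batch shell itself rather than only the job steps.

```shell
#!/bin/bash
#SBATCH --job-name=cleanup-demo   # hypothetical job name

# Hypothetical scratch area that must be removed before the job exits.
SCRATCH_DIR=$(mktemp -d)
CLEANED_UP=0

cleanup() {
    # Put the real (possibly slow) cleanup tasks here.
    rm -rf "$SCRATCH_DIR"
    CLEANED_UP=1
}

on_signal() {
    echo "caught signal, cleaning up" >&2
    # Stop the workload first, then clean up and exit gracefully.
    if [ -n "$WORK_PID" ]; then
        kill "$WORK_PID" 2>/dev/null
    fi
    cleanup
    exit 0
}

# USR1 is an arbitrary choice; it must match the name given to scancel.
trap on_signal USR1 TERM

# Placeholder workload; a real job would run e.g. "srun ./my_app" here.
# It runs in the background so that bash can execute the trap right away
# instead of waiting for a foreground command to finish.
sleep "${DEMO_SECONDS:-1}" &
WORK_PID=$!
wait "$WORK_PID"

# Normal completion: run the same cleanup.
cleanup
```

With a script like this, the user would send the signal with "scancel --signal=USR1 <jobid>" instead of a plain scancel, and the job would end on its own once cleanup has finished.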

The Epilog scripts are a different matter. There are Prolog/Epilog scripts that
run as root; there are Task-Prolog/Epilog scripts that run as a normal user but
are defined system-wide by sysadmins; and there are the per-srun user
prolog/epilog scripts, defined with srun's options --task-prolog and
--task-epilog, which should be short, otherwise they are killed.

You could also have a look at slurm.conf options such as KillWait, WaitTime and
all the Timeouts (not all of them are useful for your case, though).

Best Regards,
Chrysovalantis Paschoulas


On 12.04.2017 23:40, Roger Moye wrote:
I have an unusual configuration question for the Slurm community.

When a user cancels a job he sees this message:
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.

Is there a way to lengthen this time?

One of our users has constructed his jobs such that when a job is cancelled, it 
tries to perform some cleanup tasks before exiting but these tasks take longer 
than 2 seconds.

Is there a way to lengthen the time a job will continue to run even after it 
has been cancelled? I guess this would be long enough to allow an Epilog
script to run.

Thanks in advance!
-Roger



