Thanks for the answer. Unfortunately, none of the workarounds is acceptable in my use case...
For the record, the trac issue is: https://arc.liv.ac.uk/trac/SGE/ticket/576

It looks like it would be possible to remove the notify flag using JGDI, but then I hit this issue: https://arc.liv.ac.uk/trac/SGE/ticket/1605

Julien

2017-03-06 12:29 GMT+01:00 Reuti <re...@staff.uni-marburg.de>:

> Hi,
>
> > On 06.03.2017 at 10:36, Julien Nicoulaud <julien.nicoul...@gmail.com> wrote:
> >
> > I run jobs with -notify and a long notify time of 30 minutes, as the
> > jobs can have a very long cleanup.
> > This works fine: when using "qdel", USR2 is sent and handled by my jobs.
> >
> > But in some cases, I would like to force kill the job immediately (by
> > sending the KILL signal).
> > I cannot find any way to do this, any idea?
>
> Unfortunately the -notify has no y/n option, and hence we can't change its
> setting by `qalter`. There are two similar ways to remove them anyway:
>
> 1. Abuse a checkpointing interface to kill it by rescheduling it (must be
> attached to the queue and requested by job submission).
>
> $ qconf -sckpt killer
> ckpt_name          killer
> interface          userdefined
> ckpt_command       none
> migr_command       none
> restart_command    none
> clean_command      none
> ckpt_dir           /tmp
> signal             none
> when               x
>
> The running job can be checkpointed by `qmod -sj <job_id>`, this will send
> a sigkill to the job and reschedule it. While it is waiting again, you can
> use the usual `qdel` to remove it from the waiting list.
>
> (2. but not optimal: Submit the jobs with "-r y" and reschedule them by
> `qmod -rj <job_id>`. While it's waiting again, you can use the `qdel` on
> the (again) waiting job. But the jobs will continue on the node although
> they vanished from the job list. There were discussions on the list before,
> that it will need some time until they really cease operation.)
>
> -- Reuti
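
For anyone finding this thread later, workaround 1 from the quoted reply could be scripted roughly as follows. This is only a sketch: it assumes the job was submitted with `-ckpt killer` and that the `killer` checkpointing environment is attached to the queue. The names `force_kill`, `run` and `DRY_RUN` are made up for illustration and are not part of Grid Engine.

```shell
#!/bin/sh
# Sketch of workaround 1: force-kill a job that was submitted with -notify.
# Assumption: the job requested the "killer" checkpointing environment
# (qsub -ckpt killer ...) and that environment is attached to the queue.
# force_kill, run and DRY_RUN are hypothetical helper names.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

force_kill() {
    job_id=$1
    # Suspending the job triggers the checkpoint ("when x"); per the
    # reply above, the job is then killed and rescheduled.
    run qmod -sj "$job_id" || return 1
    # Remove the rescheduled job from the waiting list. A real script
    # may need to wait here until the job is actually pending again.
    run qdel "$job_id"
}
```

Running `DRY_RUN=1 force_kill 1234` only prints the two commands, which makes it possible to check the sequence before touching a live job.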