If you pass the '-notify' option to qsub, SIGUSR1 will be sent to your script prior to SIGSTOP, and SIGUSR2 will be sent prior to SIGKILL.
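As a rough illustration, a job script submitted with '-notify' might catch the two signals along these lines. This is only a sketch: the STATUS_FILE path, the note() helper, the message text and the exit status are my own assumptions, not SGE conventions. JOB_ID is the variable SGE sets inside a running job; outside SGE the sketch falls back to the shell's PID.

```shell
#!/bin/sh
# Sketch of a job script for a job submitted with: qsub -notify job.sh
# STATUS_FILE, note() and the messages are illustrative names (assumptions).
# One status file per job, so no locking between jobs is needed.
STATUS_FILE="${STATUS_FILE:-${TMPDIR:-/tmp}/job.${JOB_ID:-$$}.status}"

note() {
    # Append one timestamped line to this job's own status file.
    printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$1" >> "$STATUS_FILE"
}

# SIGUSR1 arrives ahead of SIGSTOP: record it and carry on.
on_usr1() { note "SIGUSR1 received: suspend pending"; }

# SIGUSR2 arrives ahead of SIGKILL: record it and leave on our own terms.
# The exit status used here is an arbitrary choice for the sketch.
on_usr2() { note "SIGUSR2 received: kill pending"; exit 1; }

trap on_usr1 USR1
trap on_usr2 USR2

note "stage 1 started"
# ... real pipeline work goes here ...
note "stage 1 finished"
```

Keep in mind that the shell only runs a trap handler after the current foreground command returns, so a long-running step may need to be started in the background and waited on with 'wait' for the handler to fire in time.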
You can then trap SIGUSR1 and SIGUSR2 within your script and write an error condition to a file. You can also use "breadcrumbs", that is, echo statements at each stage of your script indicating progress. If a job is killed, your output file will be incomplete, which is an indication that your job was killed.

Unfortunately, placing a hold on a completion script won't work if your pipeline scripts exit with anything other than exit 0. However, if SIGUSR1/USR2 can be trapped then you can exit gracefully, even if SGE decides to kill your job.

Writing to a single file or socket requires serialization (implied locking); otherwise you'll get garbage as processes trip over each other trying to write to the descriptor. Serialization is not an issue if each process writes output/error to its own file in a directory.

John.

On Thu, Jul 17, 2014 at 9:14 AM, Paolo Di Tommaso <paolo.ditomm...@gmail.com> wrote:
> Let me elaborate more on my use case: my main goal is being able to get
> notified when a job is killed by SGE because some hard resource limit is
> exceeded.
>
> Since I'm submitting many jobs, programmatically, by using an external tool,
> I would need a mechanism to get notified when jobs terminate and above all
> when some of them are killed. For this reason I would need a strategy other
> than an email message, which is not useful in this scenario.
>
> Since SGE kills jobs by sending a SIGTERM signal, there's no way to
> intercept it in the job script. So I can't invoke the mailer from it or
> implement any other strategy there.
>
> Alternatives could be to use an epilog script or the "qacct" command, but
> unfortunately neither of them is available in the SGE cluster of my institute.
>
> Thus, a custom mailer script is the only available option. I could use it to
> write the job notification to a file or even better to a socket.
> But I would need to do that at user level, I mean only for jobs submitted
> from my environment, without changing the default mailer for the other
> cluster users.
>
> For this reason I'm wondering if the mailer SGE configuration can be defined
> in the user environment. I'm not sure but I seem to remember that it is
> possible to define the sge_conf file somewhere in the user $HOME directory.
> Any clue about that?
>
> Thanks,
> Paolo
>
>
>
> On Thu, Jul 17, 2014 at 2:27 PM, John Kloss <john.kl...@gmail.com> wrote:
>>
>> The default mailer for SGE is /usr/bin/mail. There is nothing really
>> stopping you from forgoing the -M and/or the -m options and instead
>> calling /usr/bin/mail or whatever else you like from within your
>> script.
>>
>> That's what I was trying to indicate before, though I wasn't
>> particularly clear on that point. Much of the customization that you
>> would like can take place within your pipeline scripts and outside of
>> changing the configuration of SGE.
>>
>> John.
>>
>> On Thu, Jul 17, 2014 at 6:20 AM, Paolo Di Tommaso
>> <paolo.ditomm...@gmail.com> wrote:
>> > This looks interesting. Is it possible to override the mailer property
>> > at user level?
>> >
>> > I mean I would like to define my own mailer script without affecting the
>> > other cluster users.
>> >
>> >
>> > Cheers,
>> > Paolo
>> >
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users