Hi,

> On 22.12.2016 at 01:47, Jin Li <lijin....@gmail.com> wrote:
>
> Hi all,
>
> Could you help show me how to get the exit code of the running command
> submitted to SGE?
>
> For example, I have two jobs A and B submitted to SGE, and B depends
> on A. The job B wants to execute only if commands in the job A exit
> successfully. Then how could I get the exit code of the running
> commands in the job A? Thanks for your help.

The -hold_jid option will start job B as soon as job A has left the cluster. Whether job A completed successfully or not is not taken into account. To implement a workflow which honors the result of A you have several options.

First of all, it's desirable to have the exit code of the application also become the exit code of the complete job, independent of any post-processing, as you then have the chance to check it later when inspecting the accounting of the job:

#!/bin/sh
mybinary < input > output
joberror=$?
any_post_processing_you_have_to_perform
exit $joberror

Then you can also check the exit code with `qacct -j <job_id>`. While it would be possible to do this in job B, there might be a race condition: the accounting record is written only after job A has left the cluster, and hence may not yet exist when job B starts. Whether you use `qacct -j <job_id>` or scan the accounting file (which is a plain text file) manually, job B has to loop until an entry for the job in question exists.

In case the exechosts are also submission hosts, job A could change the settings of job B itself: either by removing a hold on this job (one set in addition to the -hold_jid), or, in case it is sensible for job B to start anyway, by setting a job context or environment variable for job B to indicate which processing it should perform. As long as you address the jobs by name (which many SGE commands accept in place of the job number), it is easy to know the name of job B already in job A.

Otherwise it's possible to attach a job context to job A, and unlike environment variables this also works at run time. I.e. when you submit job B and hence know its job number, it can be attached to job A by `qalter -ac NEXT_JOB_ID=123456 6437358`, where 6437358 is the known job number of job A and 123456 is the job id of job B. To script this you may also look into the option "-terse".

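To illustrate the scripting with -terse, here is a minimal sketch and not a tested recipe. It assumes plain (non-array) jobs, so that `qsub -terse` prints only the job number, and a host that is allowed to run qsub, qalter and qrls. The function name submit_chain and the script names are made up for this example. Submitting job A with a user hold (-h) is my own addition: it keeps job A from finishing before the context has been attached.

```shell
#!/bin/sh
# Hypothetical helper: submit job A and job B, then tell job A
# (via a job context) which job id follows it.
submit_chain() {
    script_a=$1
    script_b=$2

    # -terse: print only the job id; -h: submit with a user hold
    # so job A cannot start (and finish) before qalter has run.
    id_a=$(qsub -terse -h "$script_a") || return 1

    # Job B waits until job A has left the cluster.
    id_b=$(qsub -terse -hold_jid "$id_a" "$script_b") || return 1

    # Attach the id of job B to job A as context NEXT_JOB_ID.
    qalter -ac "NEXT_JOB_ID=$id_b" "$id_a" >/dev/null || return 1

    # Release the user hold so job A may be scheduled.
    qrls "$id_a" >/dev/null || return 1

    echo "$id_a $id_b"
}
```

Called as `submit_chain job_a.sh job_b.sh`, it prints the two job numbers separated by a space.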
The NEXT_JOB_ID (or any other name you prefer) can be checked in job A by:

qstat -j $JOB_ID | sed -n -e "/^context/s/^context: *//p" | tr "," "\n"

just before it exits to take proper action.

Please let me know in case you need further details.

-- Reuti

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users