Hi Reuti, all,

On 21 April 2017 at 18:21, Reuti <re...@staff.uni-marburg.de> wrote:
> > I want to switch from using qrsh directly to using a wrapper ('qrshx')
> > that gives me a session in which all the env vars set in qsub/qsh
> > sessions (e.g. JOB_ID) are defined:
> >
> > $ cat /usr/local/scripts/qrshx
> > #!/bin/sh
> > # ...
> > exec qrsh $( [ -z ${DISPLAY+x} ] || echo '-v DISPLAY' ) -pty y "$@" $SHELL
> > (from https://gist.github.com/willfurnass/10277756070c4f374e6149a281324841)
> >
> > However, I find that using qrshx that unless I specify '-now n' I
> > don't get a session but attempts to start a qrsh session directly with
> > the same resource requests succeed.
> >
> > [te1st@sharc-login1 ~]$ qrsh -P rse -l gpu=1
> > [te1st@sharc-node126 ~]$ # works
>
> Without a command, it will go to "qtype INTERACTIVE"
>
> > [te1st@sharc-login1 ~]$ qrshx -P rse -l gpu=1
> > [te1st@sharc-login1 ~]$ # failed
>
> This has a command: $SHELL, and will go to "qtype BATCH"
>
> > [te1st@sharc-login1 ~]$ qrshx -P rse -l gpu=1 -now n
> > [te1st@sharc-node126 ~]$ # works
>
> Same here.
>
> Do you have more than one queue in the cluster? Do you use a JSV which
> could influence this behavior?

I've gotten to the bottom of the issue, and it wasn't due to our JSV:
`qrshx -P rse -l gpu=1` was triggering an SGE prolog script that took
~5s to run, causing some kind of timeout. I've made the prolog script
much faster [1] and now my `qrshx` script can start interactive
sessions.

I'm going to promote qrshx to our users as a nicer alternative to
qrsh/qsh.

[1] I'm now using /proc/driver/nvidia/gpus to determine the number of
NVIDIA GPUs on the execution host, as `nvidia-smi -L` is very slow.

Cheers,

Will

-- 
Dr Will Furnass | Research Software Engineer
Dept of Computer Science | University of Sheffield
https://rse.shef.ac.uk | @willfurnass | +44 (0)114 22 21872
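
For readers wanting to try the approach in [1], here is a minimal sketch of counting GPUs from /proc/driver/nvidia/gpus rather than calling `nvidia-smi -L`. It is not the actual prolog script. It assumes the NVIDIA driver exposes one subdirectory per GPU under that path; the function name `count_nvidia_gpus` and its optional directory argument are hypothetical, added only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Sketch only: count NVIDIA GPUs by counting the per-device subdirectories
# under the driver's procfs tree (assumed layout: one directory per GPU).
# The optional first argument is a hypothetical override of the default path.
count_nvidia_gpus() {
    gpu_dir="${1:-/proc/driver/nvidia/gpus}"
    count=0
    # If the driver is not loaded the directory is absent, so report 0.
    if [ -d "$gpu_dir" ]; then
        for entry in "$gpu_dir"/*; do
            # An unmatched glob stays literal, so only count real directories.
            if [ -d "$entry" ]; then
                count=$((count + 1))
            fi
        done
    fi
    echo "$count"
}
```

In a prolog script this might be used as `ngpus=$(count_nvidia_gpus)`. Because it only lists a directory and does not query the devices, it should avoid the delay seen with `nvidia-smi -L`.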