On Sep 3, 2014, at 9:27 AM, Matt Thompson <fort...@gmail.com> wrote:

> Just saw this, sorry. Our srun is indeed a shell script. It seems to be a 
> wrapper around the regular srun that runs a --task-prolog. What it 
> does...that's beyond my ken, but I could ask. My guess is that it probably 
> does something that helps keep our old PBS scripts running (sets 
> $PBS_NODEFILE, say). We used to run PBS but switched to SLURM recently. The 
> admins would, of course, prefer all future scripts be SLURM-native scripts, 
> but there are a lot of production runs that use many, many PBS scripts. 
> Converting that would need slow, careful QC to make sure any "pure SLURM" 
> versions act as expected.

Ralph and I haven't had a chance to discuss this in detail yet, but I have 
thought about this quite a bit.

What is happening is that one of the $argv entries OMPI passes is of the form 
"foo;bar".  Your srun script is interpreting the ";" as the end of the command 
and the "bar" as the beginning of a new command, and mayhem ensues.

Basically, your srun script is violating what should be a very safe assumption: 
that the $argv we pass to it will not be interpreted by a shell.  Put 
differently: your "srun" script behaves differently than SLURM's "srun" 
executable.  This violates OMPI's expectations of how srun should behave.
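
To make the failure mode concrete, here is a minimal sketch (not your actual 
srun wrapper, which I haven't seen) contrasting the two behaviors.  It assumes 
a POSIX /bin/sh; the helper names run_direct and run_via_shell_string are made 
up for illustration.

```python
import subprocess
import sys


def run_direct(args):
    """Pass args as a real argv list: no shell ever looks at them."""
    child = [sys.executable, "-c", "import sys; print(sys.argv[1:])", *args]
    out = subprocess.run(child, capture_output=True, text=True, check=True)
    return out.stdout.strip()


def run_via_shell_string(args):
    """Join args into one string and let /bin/sh re-parse it.

    This mimics (as an assumption) what a shell wrapper does when it
    re-expands its arguments unquoted: ";" becomes a command separator.
    """
    cmd = "echo " + " ".join(args)
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return out.stdout.strip()


if __name__ == "__main__":
    arg = ["foo;bar"]
    # The argument arrives intact as a single argv entry.
    print("direct:", run_direct(arg))
    # The shell runs "echo foo" and then tries to run a command named "bar".
    print("via shell:", run_via_shell_string(arg))
```

In the first case the child sees "foo;bar" as one argument, which is what OMPI 
expects of srun.  In the second, the shell splits the string at the ";".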

My $0.02 is that if we "fix" this in OMPI, we're effectively penalizing all 
other SLURM installations out there that *don't* violate this assumption (i.e., 
all of them).  Ralph may disagree with me on this point, BTW -- like I said, we 
haven't talked about this in detail since Tuesday.  :-)

So here's my question: is there any chance you can change your "srun" script to 
a script language that doesn't recombine $argv?  This is a common problem, 
actually -- sh/csh/etc. script languages tend to recombine $argv, but other 
languages such as perl and python do not (e.g., 
http://stackoverflow.com/questions/6981533/how-to-preserve-single-and-double-quotes-in-shell-script-arguments-without-the-a).
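
As a rough sketch of what I mean, a Python wrapper could look something like 
the following.  The paths REAL_SRUN and TASK_PROLOG are placeholders that I 
made up; I don't know where your real srun binary lives or what your prolog 
does, so treat this as an illustration rather than a drop-in replacement.

```python
import os
import sys

# Hypothetical locations; adjust for the actual installation.
REAL_SRUN = "/usr/bin/srun.real"
TASK_PROLOG = "/usr/local/sbin/task_prolog.sh"


def build_command(argv):
    """Build the argv for the real srun, forwarding user args untouched.

    argv[0] is the wrapper's own name, so only argv[1:] is forwarded.
    Each entry stays a separate list element; nothing is re-parsed.
    """
    return [REAL_SRUN, "--task-prolog=" + TASK_PROLOG, *argv[1:]]


def main():
    cmd = build_command(sys.argv)
    # execv replaces this process with srun and hands it the list as-is,
    # so an argument such as "foo;bar" reaches srun as one argv entry.
    os.execv(cmd[0], cmd)


if __name__ == "__main__":
    main()
```

Because the arguments are never joined into a string and handed to a shell, 
special characters in them are not re-interpreted on the way through.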

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
