On Sep 3, 2014, at 9:27 AM, Matt Thompson <fort...@gmail.com> wrote:

> Just saw this, sorry. Our srun is indeed a shell script. It seems to be a
> wrapper around the regular srun that runs a --task-prolog. What it
> does...that's beyond my ken, but I could ask. My guess is that it probably
> does something that helps keep our old PBS scripts running (sets
> $PBS_NODEFILE, say). We used to run PBS but switched to SLURM recently. The
> admins would, of course, prefer all future scripts be SLURM-native scripts,
> but there are a lot of production runs that uses many, many PBS scripts.
> Converting that would need slow, careful QC to make sure any "pure SLURM"
> versions act as expected.

Ralph and I haven't had a chance to discuss this in detail yet, but I have thought about this quite a bit.

What is happening is that one of the $argv entries OMPI passes is of the form "foo;bar". Your srun script is interpreting the ";" as the end of the command and the "bar" as the beginning of a new command, and mayhem ensues.

Basically, your srun script is violating what should be a very safe assumption: that the $argv we pass to it will not be interpreted by a shell. Put differently: your "srun" script behaves differently than SLURM's "srun" executable. This violates OMPI's expectations of how srun should behave.

My $0.02 is that if we "fix" this in OMPI, we're effectively penalizing all other SLURM installations out there that *don't* violate this assumption (i.e., all of them). Ralph may disagree with me on this point, BTW -- like I said, we haven't talked about this in detail since Tuesday. :-)

So here's my question: is there any chance you can change your "srun" script to a script language that doesn't recombine $argv? This is a common problem, actually -- sh/csh/etc. script languages tend to recombine $argv, but other languages such as perl and python do not (e.g., http://stackoverflow.com/questions/6981533/how-to-preserve-single-and-double-quotes-in-shell-script-arguments-without-the-a).

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
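
To make the suggestion above concrete, here is a rough sketch of what an srun wrapper in python might look like if it hands $argv to the real binary as a list, so that no shell ever re-parses an argument such as "foo;bar". The names here are assumptions for illustration only: REAL_SRUN is a hypothetical path, and build_command, exec_real_srun, and echo_args_without_shell are made-up helpers, not part of SLURM or OMPI. Whatever site-specific setup the existing wrapper does (the task prolog, $PBS_NODEFILE, etc.) is not shown because I don't know what it is.

```python
import os
import subprocess
import sys

# Hypothetical location of the real SLURM srun binary; adjust for the site.
REAL_SRUN = "/usr/bin/srun"


def build_command(argv, real_srun=REAL_SRUN):
    """Return the exec argument vector: the real binary plus argv, element for element."""
    return [real_srun] + list(argv)


def exec_real_srun(argv, real_srun=REAL_SRUN):
    """Replace this process with the real srun; the arguments never pass through a shell."""
    cmd = build_command(argv, real_srun)
    # Any site-specific environment setup would go here, before the exec.
    os.execv(cmd[0], cmd)


def echo_args_without_shell(argv):
    """Demo helper: pass argv to a child process as a list and report what it received."""
    child = [sys.executable, "-c", "import sys; print(repr(sys.argv[1:]))"]
    out = subprocess.run(child + list(argv), capture_output=True, text=True, check=True)
    return out.stdout.strip()
```

A real wrapper would end by calling exec_real_srun(sys.argv[1:]). The demo helper is only there to show that an argument containing ";" or a space reaches the child as a single element when it is passed as a list rather than as a command string.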