Ah, if it's Perl, it might be an easy fix. It might just be the difference 
between system("...string...") and system(@argv).
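
For illustration only, here is a rough Python analogue of that difference (not the actual wrapper, which has not been shared on the list). It assumes a POSIX /bin/sh; CHILD, run_as_list, and run_as_string are made-up names for this sketch. The list form hands each argument straight to the child, while the string form lets a shell parse the joined line, so "foo;bar" gets split at the ";".

```python
import json
import shlex
import subprocess
import sys

# Stand-in child program (hypothetical): prints the arguments it received
# as a JSON list so we can see exactly what arrived.
CHILD = "import json, sys; print(json.dumps(sys.argv[1:]))"


def run_as_list(args):
    """Pass an argument list directly; no shell ever sees the arguments."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def run_as_string(args):
    """Join the arguments into one string and let /bin/sh parse it."""
    prefix = f"{shlex.quote(sys.executable)} -c {shlex.quote(CHILD)}"
    command = prefix + " " + " ".join(args)  # args deliberately left unquoted
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    # Only the first line comes from the child; text after ";" ran separately.
    return json.loads(result.stdout.splitlines()[0])


if __name__ == "__main__":
    print(run_as_list(["foo;bar"]))    # argument arrives intact
    print(run_as_string(["foo;bar"]))  # shell cut the argument at ";"
```

With the list form the child should see the single argument "foo;bar"; with the string form it should see only "foo", and the shell then tries to run "bar" as a separate command.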

Sent from my phone. No type good.

On Sep 4, 2014, at 8:35 AM, "Matt Thompson" 
<fort...@gmail.com> wrote:

Jeff,

I actually misspoke earlier. It turns out our srun is a *Perl* script around 
the SLURM srun. I'll speak with our admins to see if they can massage the 
script to not interpret the arguments. If possible, I'll ask them if I can 
share the script with you (privately or on the list) and maybe you can see how 
it is affecting Open MPI's argument passage.

Matt


On Thu, Sep 4, 2014 at 8:04 AM, Jeff Squyres (jsquyres) 
<jsquy...@cisco.com> wrote:
On Sep 3, 2014, at 9:27 AM, Matt Thompson 
<fort...@gmail.com> wrote:

> Just saw this, sorry. Our srun is indeed a shell script. It seems to be a 
> wrapper around the regular srun that runs a --task-prolog. What it 
> does...that's beyond my ken, but I could ask. My guess is that it probably 
> does something that helps keep our old PBS scripts running (sets 
> $PBS_NODEFILE, say). We used to run PBS but switched to SLURM recently. The 
> admins would, of course, prefer all future scripts be SLURM-native scripts, 
> but there are a lot of production runs that use many, many PBS scripts. 
> Converting that would need slow, careful QC to make sure any "pure SLURM" 
> versions act as expected.

Ralph and I haven't had a chance to discuss this in detail yet, but I have 
thought about this quite a bit.

What is happening is that one of the arguments in the $argv that OMPI passes 
is of the form "foo;bar".  Your srun script is interpreting the ";" as the end 
of the command and the "bar" as the beginning of a new command, and mayhem 
ensues.

Basically, your srun script is violating what should be a very safe assumption: 
that the $argv we pass to it will not be interpreted by a shell.  Put 
differently: your "srun" script behaves differently than SLURM's "srun" 
executable.  This violates OMPI's expectations of how srun should behave.

My $0.02 is that if we "fix" this in OMPI, we're effectively penalizing all 
other SLURM installations out there that *don't* violate this assumption (i.e., 
all of them).  Ralph may disagree with me on this point, BTW -- like I said, we 
haven't talked about this in detail since Tuesday.  :-)

So here's my question: is there any chance you can change your "srun" script to 
a script language that doesn't recombine $argv?  This is a common problem, 
actually -- sh/csh/etc. script languages tend to recombine $argv, but other 
languages such as perl and python do not (e.g., 
http://stackoverflow.com/questions/6981533/how-to-preserve-single-and-double-quotes-in-shell-script-arguments-without-the-a).
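
As a rough sketch of what a non-recombining wrapper could look like (in Python here; the same idea applies to Perl's list-form system/exec), something along these lines forwards every argument untouched. REAL_SRUN, forward_command, and main are hypothetical names, and the path is an assumption, not the site's actual layout.

```python
import os
import sys

# Hypothetical location of the real SLURM srun binary; adjust per site.
REAL_SRUN = "/usr/bin/srun"


def forward_command(wrapper_argv, real_srun=REAL_SRUN):
    """Build the argv for the real srun, one list element per argument.

    Nothing is joined into a string, so characters such as ";" are never
    re-parsed by a shell.
    """
    return [real_srun, *wrapper_argv[1:]]


def main():
    # Site-specific setup (e.g., exporting PBS-compatibility variables)
    # would go here, before handing off to the real srun.
    cmd = forward_command(sys.argv)
    # execv replaces this process and passes the list as-is; no shell runs.
    os.execv(cmd[0], cmd)
```

A real wrapper would call main() at the bottom of the file. The point is only that the arguments stay a list from start to finish, so "foo;bar" reaches srun as one argument.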

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/09/25263.php



--
"And, isn't sanity really just a one-trick pony anyway? I mean all you
 get is one trick: rational thinking. But when you're good and crazy,
 oooh, oooh, oooh, the sky is the limit!" -- The Tick

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/09/25264.php
