Thanks, Ralph!

a) Yes, I know I could restrict the job to IB with "--mca btl openib", but I just want to make sure I am actually using the IB interfaces. I am looking for an mpirun option that prints the interconnect protocol actually in use, like --prot to mpirun in MPICH2.
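One way to check this, sketched here under assumptions rather than as a confirmed equivalent of --prot: leave tcp out of the BTL list and raise the BTL framework verbosity so that component selection is logged when the job starts. The verbosity value 30 and the file name appfile are placeholders, and the script is written for a Bourne-style shell; it only prints the command so it can be inspected before being run.

```shell
#!/bin/sh
# Sketch: restrict Open MPI to the openib BTL (plus self and sm) and raise
# BTL verbosity so that component selection is logged at startup.

# tcp is deliberately left out: if InfiniBand cannot be used, the job should
# fail instead of silently falling back to TCP.
BTL_LIST="openib,self,sm"

# Assumption: 30 is an illustrative verbosity level; "appfile" is a placeholder.
MPIRUN_CMD="mpirun --mca btl $BTL_LIST --mca btl_base_verbose 30 --app appfile"

# Print the command instead of running it.
echo "$MPIRUN_CMD"
```

Running "ompi_info | grep openib" beforehand shows whether the openib component was built into the installation at all.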
b) Yes, my default shell is bash, but I run a C-shell script from a bash terminal, and mpirun is invoked inside this C-shell script. I am using the rsh launcher, exactly as you guessed. I tried different mpirun commands in the C-shell script; one of them is

    /path/to/bin/mpirun --mca btl openib --app appfile

mpirun and orted are under /path/to/bin, and the necessary libs are under /path/to/lib. I tried -x, --prefix, and --path; none of them propagates PATH and LD_LIBRARY_PATH as expected, since orted is still not found on the slave nodes, although it should be, since it is on the shared NFS partition.

Thanks,
Yiguang

On Jun 28, 2011, at 9:05 AM, yanyg_at_[hidden] wrote:

> Hello All,
>
> I installed Open MPI 1.4.3 on our new HPC blades, with Infiniband
> interconnection.
>
> My system environments are as:
>
> 1)uname -a output:
> Linux gulftown 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT
> 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> 2) /home is mounted over all nodes, and mpirun is started under
> /home/...
>
> Open MPI and application codes are compiled with intel(R)
> compilers V11. Infiniband stack is Mellanox OFED 1.5.2.
>
> I have two questions about mpirun:
>
> a) how could I get to know what is the network interconnect
> protocol used by the MPI application?
>
> I specify "--mca btl openib,self,sm,tcp" to mpirun, but I want to
> make sure it really uses infiniband interconnect.

Why specify tcp if you don't want it used? Just leave that off and it will have no choice but to use IB.

>
> b) when I run mpirun, I get the following message:
> It seems orted is not found on slave nodes. If I set the PATH and
> LD_LIBRARY_PATH through --prefix to mpirun, or --path, or -x
> options to mpirun, to make the orted and related dynamic libs
> available on slave nodes, it does not work as expected from mpirun
> manual page. The only working case is that I set PATH and
> LD_LIBRARY_PATH in ~/.bashrc for mpirun, and this .bashrc is
> invoked by slave nodes too for login shell. I do not want to set PATH
> and LD_LIBRARY_PATH in ~/.bashrc, but instead to set options to
> mpirun directly.

Should work with either prefix or -x options, assuming the right syntax with the latter. I take it your default shell is bash, and that you are using the rsh launcher (as opposed to something like torque)? Are you launching from your default shell, or did you perhaps change shell? Can you send the actual mpirun command you typed?
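
For reference, the syntax of the two options discussed in this exchange can be sketched as follows. This is an illustration under assumptions, not a confirmed fix for the C-shell case: /path/to stands for the install root from the message, appfile is a placeholder, and the script is written for a Bourne-style shell. Note that --prefix takes the installation root (the directory containing bin/ and lib/), not the bin/ directory itself, and that each -x names exactly one environment variable.

```shell
#!/bin/sh
# Sketch: pass the Open MPI install location to remote nodes via mpirun
# options rather than via ~/.bashrc.

# Placeholder install root: the directory that contains bin/ and lib/.
OMPI_PREFIX="/path/to"

# --prefix tells mpirun where Open MPI lives on the remote nodes, so that
# PATH and LD_LIBRARY_PATH are set there before orted is started.
# -x exports one existing environment variable per option to the MPI processes.
MPIRUN_CMD="$OMPI_PREFIX/bin/mpirun --prefix $OMPI_PREFIX -x PATH -x LD_LIBRARY_PATH --mca btl openib,self,sm --app appfile"

# Print the command instead of running it.
echo "$MPIRUN_CMD"
```

A variable can also be given a value directly, as in "-x LD_LIBRARY_PATH=/path/to/lib", when it is not already set in the launching shell.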