Hah! Your reply came in seconds after I replied.
Your questions made me notice that we're missing a FAQ entry for the
"ssh:rsh" explanation, though, so I'll add an entry for that. Thanks.
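A rough sketch of how a value like "ssh : rsh" can be read, assuming the launcher treats it as a colon-separated list and uses the first agent it finds in PATH. That reading is an assumption, not something spelled out in this thread, and pick_agent is a made-up helper name, not part of Open MPI:

```shell
# Hypothetical helper: print the first agent from a colon-separated list
# (e.g. "ssh : rsh") that can be found in PATH.
pick_agent() {
    old_ifs=$IFS
    IFS=:
    set -- $1            # split the list on ':'
    IFS=$old_ifs
    for candidate in "$@"; do
        # Keep only the first word; real agent values may carry extra arguments.
        name=$(printf '%s' "$candidate" | awk '{print $1}')
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1             # nothing in the list was found
}
```

Calling pick_agent "ssh : rsh" on a node prints whichever of the two would be picked there under that assumption, or returns non-zero if neither is installed.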
On May 18, 2007, at 5:15 PM, Steven Truong wrote:
Hi, Jeff. OK. After reading through the FAQ, I modified .bashrc to
set PATH and LD_LIBRARY_PATH, and now I can execute:
[struong@neptune ~]$ ssh node07 which orted
/usr/local/openmpi-1.2.1/bin/orted
[struong@neptune ~]$ /usr/local/openmpi-1.2.1/bin/mpirun --host node07 hostname
node07.nanostellar.com
Thank you.
Steven.
On 5/18/07, Steven Truong <midai...@gmail.com> wrote:
Hi, Jeff. Thanks so very much for all your help so far. I decided
that I needed to go back and check whether openmpi even works for
simple cases, so here I am.
So my shell might have exited when it detected that I was running
non-interactively. But then again, how does this parameter
MCA pls: parameter "pls_rsh_agent" (current value: "ssh :rsh")
affect my outcome? How do I set PATH and LD_LIBRARY_PATH in my Torque
job files so that they match those in .bash_profile?
Could you give me some tips here?
Below is my current bash shell's settings.
Thanks,
Steven.
[struong@neptune ~]$ echo $SHELL
/bin/bash
[struong@neptune ~]$ cat .bash_profile | grep -v ^#
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
umask 027
PATH=/opt/intel/fce/9.1.043/bin:/usr/local/openmpi-1.2.1/bin:/opt/c3-4:/opt/bin:/usr/local/torque/bin:/usr/local/torque/sbin:/usr/local/maui/bin:/usr/local/maui/sbin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/usr/local/rrdtool-1.2.12/bin:~/bin
BASH_ENV=$HOME/.bashrc
FC=/opt/intel/fce/9.1.043/bin/ifort
F90=$FC
F77=$FC
F77_GETARGDECL=" "
LD_LIBRARY_PATH=/usr/local/openmpi-1.2.1/lib
RSHCOMMAND=/usr/bin/ssh
PBS_DEFAULT="neptune"
PBSLOGLEVEL=7
BUILD_DIR=/tmp/rrdbuil
INSTALL_DIR=/usr/local/rrdtool-1.2.12
source /usr/local/ecce/scripts/runtime_setup.sh
export F77 USERNAME BASH_ENV PATH RSHCOMMAND FC F90 PBS_DEFAULT BUILD_DIR INSTALL_DIR LD_LIBRARY_PATH
[struong@neptune ~]$ ssh node07 which orted
which: no orted in (/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin)
[struong@neptune ~]$ /usr/local/openmpi-1.2.1/bin/mpirun --host node07 node07 hostname
--------------------------------------------------------------------------
Failed to find the following executable:
Host: node07.nanostellar.com
Executable: node07
Cannot continue.
--------------------------------------------------------------------------
On 5/18/07, Jeff Squyres <jsquy...@cisco.com> wrote:
On May 18, 2007, at 4:38 PM, Steven Truong wrote:
[struong@neptune 4cpu4npar10nsim]$ mpirun --mca btl tcp,self -np 1 --host node07 hostname
bash: orted: command not found
As you noted later in your mail, this is the key problem: orted is
not found on the remote node.
Notice that you are currently using the rsh launcher, not the Torque
launcher (presumably because you are not inside a Torque job). What
you want to check is:
rsh node07 which orted
(or use ssh -- whatever is correct for your cluster)
I suspect that orted will not be found, and that you'll need to
modify your shell startup files to set PATH / LD_LIBRARY_PATH
properly. Note that some shell startup files will exit early if they
detect that they are running on a non-interactive login. See
http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path for
more details.
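As a minimal sketch of the startup-file point above: put the Open MPI paths near the top of ~/.bashrc, outside anything that only runs for interactive shells, so that a non-interactive "ssh node07 which orted" still sees them. The prefix is the one used in this thread; the OMPI_PREFIX variable name and the prompt line are only illustrative assumptions:

```shell
# Illustrative ~/.bashrc layout. OMPI_PREFIX is a made-up variable name;
# the path is the install prefix mentioned in this thread.
OMPI_PREFIX=/usr/local/openmpi-1.2.1

# Set these first so non-interactive shells (e.g. "ssh node07 cmd") get them too.
PATH="$OMPI_PREFIX/bin:$PATH"
LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export PATH LD_LIBRARY_PATH

# Interactive-only settings go inside a guard like this one.
case $- in
    *i*)
        PS1='[\u@\h \W]\$ '   # example prompt; any interactive-only setup fits here
        ;;
esac
```

Whether a given remote shell reads ~/.bashrc at all for non-interactive logins depends on how the shell was built and started, so the "ssh node07 which orted" check is still the thing to verify.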
Alternatively, you can simply use the absolute pathname to mpirun,
which Open MPI will interpret to mean that you want OMPI to set the
PATH/LD_LIBRARY_PATH on the remote node for you. Something like
this:
/usr/local/openmpi-1.2.1/bin/mpirun --host node07 hostname
(note that the "btl" MCA parameter is only relevant for MPI
executables)
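If typing the full path every time is a nuisance, a small wrapper can build the command line. This is only a sketch: OMPI_PREFIX and ompi_cmd are hypothetical names, and the function just prints the command (a dry run) rather than launching anything:

```shell
# Assumption: Open MPI is installed under the prefix shown in this thread.
OMPI_PREFIX=/usr/local/openmpi-1.2.1

# Hypothetical helper: print the mpirun command line with an absolute path.
ompi_cmd() {
    printf '%s' "$OMPI_PREFIX/bin/mpirun"
    printf ' %s' "$@"
    printf '\n'
}

# Show the command that would be run for the test case in this thread.
ompi_cmd --host node07 hostname
```

Once the printed line looks right, run it as shown; the point is only that mpirun is invoked by its absolute pathname.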
--
Jeff Squyres
Cisco Systems
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Jeff Squyres
Cisco Systems