On Fri, Jun 19, 2009 at 3:12 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Add --debug-devel to your cmd line and you'll get a bunch of diagnostic
> info. Did you configure --enable-debug? If so, then additional debug can be
> obtained - can let you know how to get it, if necessary.

Yes we had run with the -d flag and it was the output from this that prompted us to find out how to prevent the use of the external network. I am not sure what most of the messages mean but we still get quite a few references to hankel.fred.com which the nodes will not be able to access.

Here is the output (changed external ip numbers and domain):

[cluster@hankel ~]$ mpirun --debug-devel --mca btl tcp,self --mca btl_tcp_if_exclude lo,eth0 --mca oob_tcp_if_exclude lo,eth0 -np 1 --host n06 hostname
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] connect_uni: connection not allowed
[hankel.fred.com:26997] [0,0,0] setting up session dir with
[hankel.fred.com:26997] universe default-universe-26997
[hankel.fred.com:26997] user cluster
[hankel.fred.com:26997] host hankel.fred.com
[hankel.fred.com:26997] jobid 0
[hankel.fred.com:26997] procid 0
[hankel.fred.com:26997] procdir: /tmp/openmpi-sessions-clus...@hankel.fred.com_0/default-universe-26997/0/0
[hankel.fred.com:26997] jobdir: /tmp/openmpi-sessions-clus...@hankel.fred.com_0/default-universe-26997/0
[hankel.fred.com:26997] unidir: /tmp/openmpi-sessions-clus...@hankel.fred.com_0/default-universe-26997
[hankel.fred.com:26997] top: openmpi-sessions-clus...@hankel.fred.com_0
[hankel.fred.com:26997] tmp: /tmp
[hankel.fred.com:26997] [0,0,0] contact_file /tmp/openmpi-sessions-clus...@hankel.fred.com_0/default-universe-26997/universe-setup.txt
[hankel.fred.com:26997] [0,0,0] wrote setup file
[hankel.fred.com:26997] pls:rsh: local csh: 0, local sh: 1
[hankel.fred.com:26997] pls:rsh: assuming same remote shell as local shell
[hankel.fred.com:26997] pls:rsh: remote csh: 0, remote sh: 1
[hankel.fred.com:26997] pls:rsh: final template argv:
[hankel.fred.com:26997] pls:rsh: /usr/bin/ssh <template> orted --debug --bootproxy 1 --name <template> --num_procs 2 --vpid_start 0 --nodename <template> --universe clus...@hankel.fred.com:default-universe-26997 --nsreplica "0.0.0;tcp://192.168.0.99:54116" --gprreplica "0.0.0;tcp://192.168.0.99:54116"
[hankel.fred.com:26997] pls:rsh: launching on node n06
[hankel.fred.com:26997] pls:rsh: n06 is a REMOTE node
[hankel.fred.com:26997] pls:rsh: executing: (//usr/bin/ssh) /usr/bin/ssh n06 PATH=/usr/lib/openmpi/1.2.7-gcc/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/lib/openmpi/1.2.7-gcc/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; /usr/lib/openmpi/1.2.7-gcc/bin/orted --debug --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename n06 --universe clus...@hankel.fred.com:default-universe-26997 --nsreplica "0.0.0;tcp://192.168.0.99:54116" --gprreplica "0.0.0;tcp://192.168.0.99:54116"
[HOSTNAME= hankel.fred.com
TERM=xterm-color
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=130.149.86.77 50506 22
SSH_TTY=/dev/pts/12
USER=cluster
LD_LIBRARY_PATH=:/usr/lib/openmpi/1.2.7-gcc/lib
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:
MAIL=/var/spool/mail/cluster
PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/lib/openmpi/1.2.7-gcc/bin:/home/cluster/bin
INPUTRC=/etc/inputrc
PWD=/home/cluster
LANG=en_US.UTF-8
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SHLVL=1
HOME=/home/cluster
LOGNAME=cluster
CVS_RSH=ssh
SSH_CONNECTION=222.222.222.222 50506 111.111.111.111 22
LESSOPEN=|/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
_=/usr/lib/openmpi/1.2.7-gcc/bin/mpirun
OMPI_MCA_orte_debug=1
OMPI_MCA_btl=tcp,self
OMPI_MCA_btl_tcp_if_exclude=lo,eth0
OMPI_MCA_oob_tcp_if_exclude=lo,eth0
OMPI_MCA_seed=0]
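
For reference, the OMPI_MCA_btl, OMPI_MCA_btl_tcp_if_exclude and OMPI_MCA_oob_tcp_if_exclude entries at the end of the environment dump line up with the --mca options given on the mpirun command line, so the exclude settings do appear to be passed along. Below is a minimal Python sketch of that correspondence, assuming each `--mca <name> <value>` pair simply becomes `OMPI_MCA_<name>=<value>`. The helper name mca_args_to_env is hypothetical (it is not part of Open MPI), and entries such as OMPI_MCA_orte_debug and OMPI_MCA_seed, which mpirun sets for other reasons, are not covered.

```python
def mca_args_to_env(argv):
    """Collect `--mca <name> <value>` pairs from an mpirun-style argument
    list and return them as OMPI_MCA_<name> environment entries.

    Hypothetical helper for illustration only; it is not an Open MPI API.
    """
    env = {}
    i = 0
    while i < len(argv):
        # A complete pair needs two more tokens after the --mca flag.
        if argv[i] in ("--mca", "-mca") and i + 2 < len(argv):
            name, value = argv[i + 1], argv[i + 2]
            env["OMPI_MCA_" + name] = value
            i += 3
        else:
            i += 1
    return env


# The command line from the output above, split on whitespace.
cmd = (
    "mpirun --debug-devel --mca btl tcp,self "
    "--mca btl_tcp_if_exclude lo,eth0 --mca oob_tcp_if_exclude lo,eth0 "
    "-np 1 --host n06 hostname"
).split()

for key, value in mca_args_to_env(cmd).items():
    print(f"{key}={value}")
```

Comparing what this prints against the OMPI_MCA_* lines in the dump is a quick way to check that nothing was dropped or mistyped on the command line.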