Add --debug-devel to your mpirun command line and you'll get a bunch of diagnostic info. Did you configure with --enable-debug? If so, additional debug output can be obtained; I can let you know how to get it, if necessary.

Ralph
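As a rough sketch of the suggestion above, the failing launch from the original report can be re-run with --debug-devel and its output saved for posting back to the list. The node name and test program come from the quoted report; the log file name mpirun-debug.log is only an illustrative choice, and the ompi_info check assumes its summary includes a line mentioning debug support, which may vary between versions.

```shell
#!/bin/sh
# Node name and test program are taken from the original report (n06, hostname).
node="n06"
cmd="mpirun --debug-devel -np 1 --host $node hostname"

# Assumption: ompi_info prints a line mentioning debug support, which hints
# at whether the build was configured with --enable-debug.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -i "debug"
fi

echo "Running: $cmd"
if command -v mpirun >/dev/null 2>&1; then
    # Capture stdout and stderr together; mpirun-debug.log is a hypothetical name.
    # If the launch hangs as reported, interrupt with Ctrl-C; the log keeps
    # whatever was printed up to that point.
    $cmd 2>&1 | tee mpirun-debug.log
else
    echo "mpirun not found in PATH" >&2
fi
```

The last few lines of the log before the hang are usually the most useful part to include in a follow-up.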
On Thu, Jun 18, 2009 at 3:49 PM, Honest Guvnor <honestguv...@googlemail.com> wrote:

> OpenMPI 1.2.7, ethernet, Centos 5.3 i386 fresh install on host and nodes.
>
> Despite ssh and pdsh working, mpirun hangs when launching a program
> from the host to a node:
>
> [cluster@hankel ~]$ ssh n06 hostname
> n06
> [cluster@hankel ~]$ pdsh -w n06 hostname
> n06: n06
> [cluster@hankel ~]$ mpirun -np 1 --host n06 hostname
> [HANGS]
>
> However, mpirun works fine in reverse:
>
> [cluster@n06 ~]$ mpirun -np 1 --host hankel date
> Thu Jun 18 22:53:27 CEST 2009
>
> and from node to node. Paths to bin and lib seem OK:
>
> [cluster@hankel ~]$ printenv PATH
> /usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/lib/openmpi/1.2.7-gcc/bin:/home/cluster/bin
> [cluster@hankel ~]$ printenv LD_LIBRARY_PATH
> :/usr/lib/openmpi/1.2.7-gcc/lib
> [cluster@hankel ~]$ ssh n06 printenv PATH
> /usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/lib/openmpi/1.2.7-gcc/bin
> [cluster@hankel ~]$ ssh n06 printenv LD_LIBRARY_PATH
> :/usr/lib/openmpi/1.2.7-gcc/lib
>
> We are new to openmpi but checked a few mca parameters and turned on a
> diagnostic flag or two but without coming up with much. The nodes do
> not have access to the hosts external network and we half convinced
> ourselves this was the problem because of mentions in the output with
> the -d flag but:
>
> [cluster@hankel ~]$ mpirun --mca btl tcp,self --mca btl_tcp_if_exclude lo,eth0 --mca oob_tcp_if_exclude lo,eth0 -np 1 --host n06 hostname
> [STILL HANGS]
>
> where eth0 is the external network.
>
> Suggestions gratefully received on how we can get openmpi to report
> what has failed or where to poke and prod further?
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users