On Feb 12, 2007, at 12:54 PM, Matteo Guglielmi wrote:
> This is the ifconfig output from the machine I use to submit the
> parallel job:
It looks like both of your nodes share an IP address:
[root@lcbcpc02 ~]# ifconfig
eth1 Link encap:Ethernet HWaddr 00:15:17:10:53:C9
     inet addr:192.168.0.1 Bcast:192.168.0.255 Mask:255.255.255.0
[root@lcbcpc04 ~]# ifconfig
eth1 Link encap:Ethernet HWaddr 00:15:17:10:53:75
     inet addr:192.168.0.1 Bcast:192.168.0.255 Mask:255.255.255.0
This will be problematic for more than just Open MPI if these two
interfaces are on the same network. The solution is to ensure that
all of your nodes have unique IP addresses.
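For example, a minimal sketch of giving each node a distinct address on the same subnet (the second address is an assumption -- pick any unused one in your range):

```shell
# On lcbcpc02: keep its existing address
ifconfig eth1 192.168.0.1 netmask 255.255.255.0

# On lcbcpc04: change it to an address no other node uses
ifconfig eth1 192.168.0.2 netmask 255.255.255.0
```

Note that changes made with ifconfig do not survive a reboot; to make them permanent, update your distribution's network configuration (e.g., the ifcfg-eth1 file on Red Hat-style systems).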
If these NICs are on different networks, then it's a valid network
configuration, but Open MPI (by default) will assume that they are
routable to each other. You can tell Open MPI not to use eth1 in
this case -- see these FAQ entries for details:
http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network
http://www.open-mpi.org/faq/?category=tcp#tcp-selection
http://www.open-mpi.org/faq/?category=tcp#tcp-routability
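For example, a sketch using the TCP BTL's interface-selection MCA parameters (the process count and application name are placeholders):

```shell
# Tell the TCP BTL to skip eth1 (and the loopback):
mpirun --mca btl_tcp_if_exclude lo,eth1 -np 4 ./my_mpi_app

# Or, equivalently, list only the interface(s) you do want used:
mpirun --mca btl_tcp_if_include eth0 -np 4 ./my_mpi_app
```

Note that btl_tcp_if_include and btl_tcp_if_exclude are mutually exclusive -- set one or the other, not both.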
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems