Hi,

Looking at the ifconfig output, I found that each node has two interfaces, 
eth0 and eth1. I ran mpiexec with the parameter --mca btl_tcp_if_include eth0 
(or eth1) to see whether there were any issues between nodes. Here are the 
results:
- node1,node2 works with eth1, not with eth0.
- node1,node3 works with eth1, not with eth0.
- node2,node3 does not work with eth1, but works with eth0.
- node1,node2,node3 works with eth1 (!), not with eth0.
The working combinations keep working even with the firewalls enabled.

Actually, the order of the nodes matters: `mpiexec --mca btl_tcp_if_include 
eth0 --host node1,node2 ./ring_c` does not work, but `mpiexec --mca 
btl_tcp_if_include eth0 --host node2,node1 ./ring_c` works. The same thing 
happens when I change the order while launching the 3 processes (putting node2 
in first position). I find that a little disturbing, but I guess the network 
configuration is to blame.
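
For anyone who wants to repeat these tests, here is a minimal sketch that 
lists every interface / host-order combination for pairs of nodes. It assumes 
the interfaces are named eth0 and eth1 on every node and that ./ring_c is in 
the current directory, as in my runs. The build_cmd helper and the RUN 
variable are names I made up for this sketch. By default it only prints the 
commands; with RUN=1 it executes them and reports mpiexec's exit status.

```shell
#!/bin/sh
# Sketch only: enumerate the interface / host-order combinations.
# build_cmd and RUN are hypothetical names used for illustration.

build_cmd() {
    # $1 = interface name, $2 = comma-separated host list
    printf 'mpiexec --mca btl_tcp_if_include %s --host %s ./ring_c' "$1" "$2"
}

for iface in eth0 eth1; do
    for hosts in node1,node2 node2,node1 \
                 node1,node3 node3,node1 \
                 node2,node3 node3,node2; do
        cmd=$(build_cmd "$iface" "$hosts")
        echo "$cmd"
        if [ "${RUN:-0}" = "1" ]; then
            # Word splitting of $cmd is intentional: no argument contains spaces.
            if $cmd; then
                echo "  -> OK"
            else
                echo "  -> FAILED"
            fi
        fi
    done
done
```

Note that a run which hangs instead of exiting would have to be interrupted 
by hand; the sketch does not add a timeout.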

Thanks a lot, Jeff Squyres; your hints helped me find the source of the 
problem. As must often be the case, the problem came not from Open MPI but 
from the network configuration.
I'll ask my sysadmin to help me configure the interfaces so that it works 
without setting an MCA parameter.
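
Until then, a minimal sketch of a stopgap, assuming eth1 is the interface to 
keep (it is the one that worked for the three-node run): Open MPI also reads 
MCA parameters from environment variables prefixed with OMPI_MCA_, so the 
option does not have to be repeated on every command line.

```shell
# Assumption: eth1 is the interface to use on every node.
# Open MPI picks up MCA parameters from OMPI_MCA_<name> environment variables.
export OMPI_MCA_btl_tcp_if_include=eth1

# Later launches from this shell then need no --mca option, e.g.:
#   mpiexec --host node1,node2,node3 ./ring_c
echo "btl_tcp_if_include=$OMPI_MCA_btl_tcp_if_include"
```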

Thanks again.
--
Benjamin Bouvier

________________________________________
> What's the output from ifconfig on all nodes?
>
>--
>Jeff Squyres
>jsquy...@cisco.com
>For corporate legal information go to: 
>http://www.cisco.com/web/about/doing_business/legal/cri/
