Fernando Lemos wrote on 23/03/2010 16:28:
>> I'm trying to run openmpi (1.4.1) on two clusters; on each cluster, several
>> interfaces are private.
>>
>> On cluster1, nodes have 3 interfaces, and only 192.168.159.0/24 is visible
>> from cluster2:
>>
>> chicon-3
>> eth0   inet addr:192.168.160.76  Bcast:192.168.160.255  Mask:255.255.255.0
>> eth1   inet addr:192.168.159.76  Bcast:192.168.159.255  Mask:255.255.255.0
>> myri0  inet addr:192.168.162.76  Bcast:192.168.162.255  Mask:255.255.255.0
>>
>> On cluster2, nodes have 3 interfaces, and only 172.24.110.0/17 is visible
>> from cluster1:
>>
>> netgdx-8
>> eth0   inet addr:172.24.190.8  Bcast:172.24.191.255  Mask:255.255.192.0
>> eth1   inet addr:172.24.110.8  Bcast:172.24.127.255  Mask:255.255.128.0
>> eth2   inet addr:172.24.240.8  Bcast:172.24.255.255  Mask:255.255.192.0
>>
>> So I'm using this to declare all the other networks as private:
>>
>> mpirun -machinefile ~/gridnodes --mca opal_net_private_ipv4
>> "192.168.162.0/24\;192.168.160.0/24\;172.24.192.0/18\;172.24.128.0/18"
>> ./alltoall
>>
>> but this doesn't work:
>
> Have you tried -mca btl_tcp_if_include/exclude?

I can't do that because the "public" interface is not always eth1 as in this
example (I have several other clusters with different network configurations
in my setup).
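One thing that might still make btl_tcp_if_include usable, sketched below on
the assumption that each cluster has its own Open MPI installation (or a home
directory that is not shared between clusters): Open MPI also reads MCA
parameters from etc/openmpi-mca-params.conf under the installation prefix and
from $HOME/.openmpi/mca-params.conf, so the interface selection could be set
per cluster there instead of once on the shared mpirun command line:

  # cluster1 (chicon-*) and cluster2 (netgdx-*): the routable interface is eth1
  btl_tcp_if_include = eth1

  # a cluster whose routable network sits on a different interface would name
  # that interface in its own copy of the file, e.g. (hypothetical):
  # btl_tcp_if_include = eth0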
>> Why does openmpi try to connect different private networks, given that
>> "public" networks exist? Is it a bug or am I missing something?
>
> From what I've seen, I believe OpenMPI tries to find the fastest route
> to the nodes. In some cases it's trivial to sort that out, in other
> cases you might need to give it some hints.

Yes, so I thought that "opal_net_private_ipv4" was the right thing for me;
but it doesn't work without the patch.
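For what it's worth, the same MCA parameter files could in principle also
carry the opal_net_private_ipv4 list, which at least avoids escaping the
semicolons on the mpirun command line (only a sketch, and per the above it
would still need the patch before the parameter has any effect):

  # $HOME/.openmpi/mca-params.conf (or the per-installation
  # etc/openmpi-mca-params.conf); the value is read literally here, so the
  # semicolons do not need shell escaping
  opal_net_private_ipv4 = 192.168.162.0/24;192.168.160.0/24;172.24.192.0/18;172.24.128.0/18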
--
Nicolas NICLAUSSE                         Service DREAM
INRIA Sophia Antipolis                    http://www-sop.inria.fr/
2004 route des lucioles - BP 93           Tel: (33/0) 4 92 38 76 93
06902 SOPHIA-ANTIPOLIS cedex (France)     Fax: (33/0) 4 92 38 76 02