The real solution is to evict the private addresses from both levels (MPI and 
ORTE). However, based on the ordering of the interfaces, I guess you cannot do 
it by name (eth0 has a private address on one side but a public one on the 
other).

Don't panic! There is support for this.

Look at the output of "ompi_info --param btl tcp" below:

>  MCA btl: parameter "btl_tcp_if_include" (current value: <none>, data
>           source: default value)
>           Comma-delimited list of devices or CIDR notation of networks
>           to use for MPI communication (e.g., "eth0,eth1" or
>           "192.168.0.0/16,10.1.4.0/24").  Mutually exclusive with
>           btl_tcp_if_exclude.
>  MCA btl: parameter "btl_tcp_if_exclude" (current value: <lo,sppp>, data
>           source: default value)
>           Comma-delimited list of devices or CIDR notation of networks
>           to NOT use for MPI communication -- all devices not matching
>           these specifications will be used (e.g., "eth0,eth1" or
>           "192.168.0.0/16,10.1.4.0/24").  Mutually exclusive with
>           btl_tcp_if_include.

You can use the [btl|oob]_tcp_if_[include|exclude] parameters with either 
interface names or IP ranges in CIDR notation. Add the following to your 
mpirun command line:

--mca btl_tcp_if_include "210.0.0.0/8" --mca oob_tcp_if_include "210.0.0.0/8"

and everything should work in all cases.
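
If you would rather not repeat these flags on every command line, the same 
two parameters can, as far as I know, also be set through the environment, 
since Open MPI reads MCA parameters from variables named OMPI_MCA_<name>. A 
minimal sketch, assuming a Bourne-style shell and that 210.0.0.0/8 really 
covers your public network:

```shell
# Restrict MPI-level (BTL) TCP traffic to the public network.
export OMPI_MCA_btl_tcp_if_include="210.0.0.0/8"

# Restrict run-time (OOB) TCP traffic to the same network.
export OMPI_MCA_oob_tcp_if_include="210.0.0.0/8"

# Show what a subsequent mpirun would inherit from this shell.
env | grep '^OMPI_MCA_.*_tcp_if_include='
```

With these exported, a plain mpirun started from the same shell should pick 
up both settings; values given with --mca on the command line should still 
take precedence over the environment.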

  george.

On Oct 5, 2011, at 12:13 , Jeff Squyres wrote:

> Check out this FAQ entry:
> 
>    http://www.open-mpi.org/faq/?category=tcp#tcp-selection
> 
> Note that there are btl_tcp_if_include / btl_tcp_if_exclude: these control 
> MPI-level communications.  There's also oob_tcp_if_include / 
> oob_tcp_if_exclude (that take the same kinds of values as 
> btl_tcp_if_include/exclude) that control OMPI's run-time environment 
> communications.
> 
> 
> On Oct 5, 2011, at 12:01 PM, (.-=Kiwi=-.) wrote:
> 
>> "OMPI always tries to use the lowest numbered address first - just a natural 
>> ordering."
>> 
>> That doesn't seem to be the reason. We changed the private IPs to 212... (a 
>> higher number than the public 210... IPs) and still MPI tries to go to 212 
>> afterwards.
>> 
>> We're reading the oob_tcp and btl_tcp parameters but we're not sure how to 
>> do it.
>> 
>> "But if hello world doesn't even run, then try running with "mpirun --mca 
>> oob_tcp_if_include <the interface(s) you want to use> ...", per Ralph's 
>> suggestion.  If *that* doesn't work, also add "--mca btl_tcp_if_include ..." 
>> as well."
>> 
>> We tried running this from Computer 1:
>> 
>> orterun -mca oob_tcp_debug 1 -np 1 -host 212...3 ifconfig
>> 
>> and everything was ok
>> 
>> We then tried running this from Computer 1:
>> 
>> orterun -mca oob_tcp_debug 1 -np 1 -host 210...101 ifconfig
>> 
>> and it says:
>> 
>> There are no allocated resources for the application 
>>  ifconfig
>> that match the requested mapping:
>> 
>> 
>> Verify that you have mapped the allocated resources properly using the 
>> --host or --hostfile specification.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
>> launch so we are aborting. [...]  
>> 
>> Any other ideas?
>> 
>> 
>> On Wed, Oct 5, 2011 at 1:54 AM, Ralph Castain <rhc.open...@gmail.com> wrote:
>> OMPI always tries to use the lowest numbered address first - just a natural 
>> ordering. You need to tell it to use just the public ones for this topology. 
>> Use the oob_tcp and btl_tcp parameters to do this. See "ompi_info --param 
>> oob tcp" and "ompi_info --param btl tcp" for the exact syntax.
>> 
>> 
>> On Oct 4, 2011, at 10:21 AM, "(.-=Kiwi=-.)" <heffe...@gmail.com> wrote:
>> 
>>> We are constructing a set of computers with Open MPI and there's a small 
>>> problem with mixing public and private IPs.
>>> 
>>> We aren't sure about what's causing the problem or how to solve it.
>>> 
>>> The files are shared over NFS. We have a couple of computers with both 
>>> private and public IPs, and we want them to send MPI work to some 
>>> machines that have public IPs.
>>> 
>>> I'm going to try to describe with example IPs.
>>> 
>>> Computer 1 sees itself as eth0:  172...2  but has a public IP assigned:  
>>> 210...2
>>> Computer 2 sees itself as eth0:  172...3  but has a public IP assigned:  
>>> 210...3
>>> Computers outside the subnet directly have public IPs assigned:  210...100+
>>> 
>>> The computers outside see Computer 1 and 2 only with 210... they can't see 
>>> the 172... internal IPs.
>>> 
>>> If an outside computer launches mpirun to Computer 1, it works ok.
>>> If Computer 1 tries to launch mpirun to Computer 2 (with 172...) it also 
>>> works ok (not with 210... because they don't know that that's their public 
>>> IP, but that's not an issue).
>>> 
>>> The problem comes when Computer 1 or 2 try to launch mpirun to outside 
>>> computers.
>>> 
>>> We tried to check out what was happening and installed wireshark on an 
>>> outside computer and it seems that the ssh part works ok (the ssh talk 
>>> between 210...2 and 210...101 is ok), but after that the outside computer 
>>> tries to send a TCP SYN packet to 172...2 instead of 210...2, and all 
>>> subsequent packets go the same way.
>>> 
>>> Is there a way to solve this problem?
>>> 
>>> I've read this ( 
>>> http://www.open-mpi.org/community/lists/users/2009/11/11184.php ) but I'm 
>>> not really sure what he's doing there.
>>> 
>>> We have the option of plugging Computer 1 and Computer 2 directly to the 
>>> switch that the outside computers are on, but we'd rather not because we'd 
>>> prefer the computers to stay on the private network, but if there's no 
>>> other way, I guess we can.
>>> 
>>> Can it be done without having to change the network topology?
>>> 
>>> Thanks in advance.
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 

