So, I tried out the flag you mentioned that would force the use of the loopback 
interface.  It worked without error or stalling:

$ mpirun --mca oob_tcp_if_include lo0 -np 2 ./hello_cxx
Hello, world!  I am 0 of 2 (Open MPI v1.7.3, package: Open MPI macpo...@meredithk-mac.corp.fmglobal.com Distribution, ident: 1.7.3, Oct 17, 2013, 117)
Hello, world!  I am 1 of 2 (Open MPI v1.7.3, package: Open MPI macpo...@meredithk-mac.corp.fmglobal.com Distribution, ident: 1.7.3, Oct 17, 2013, 117)

Thanks for all your help!

Karl



On Dec 4, 2013, at 8:23 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> On Dec 4, 2013, at 7:25 AM, "Meredith, Karl" <karl.mered...@fmglobal.com> 
> wrote:
> 
>> Before turning off my firewall, I have these rules
>> 
>> $ sudo ipfw list
>> Password:
>> 05000 allow ip from any to any via lo*
> 
> This is an interesting rule.  Perhaps you can try:
> 
>    mpirun --mca oob_tcp_if_include lo0 ...
> 
> which would force OMPI to use the loopback interface for TCP connections 
> (it's normally excluded because it's not viable for off-node communications).  
> This would only be useful for single-node runs, of course.
> 
>> Our local IT expert believes that this problem is related to this bug from 
>> way back in openmpi 1.2.3, but it seems like the patch was never implemented:
>> http://www.open-mpi.org/community/lists/users/2007/05/3344.php
> 
> No, I don't believe that's the issue.  Here's why:
> 
> - OMPI currently ignores loopback interfaces by default.  This is done 
> because the norm is to have multi-server runs, and loopback interfaces are 
> not useful for such runs.  Put differently: OMPI defaults to using external 
> IP interfaces.
> 
> - However, all your external IP interfaces are firewalled.  So when OMPI 
> tries to make a loopback connection on the external IP interfaces, it's 
> blocked.  Kaboom.  This also explains why everything works when you disable 
> the firewall.
> 
> - That bug report you cited (good research, BTW!) stems from a problem we 
> had parsing the oob_tcp_if_include MCA parameter way back in the 1.2.x 
> series.  The user was trying to explicitly tell OMPI "use the lo0 interface" 
> (i.e., override the default of *not* using the lo0 interface), and the bug 
> prevented that from working.  That bug has long since been fixed: you can 
> override OMPI's default of not using lo0, and you should then be able to run 
> without disabling your firewall (that's what the mpirun syntax I cited above 
> is doing).
> 
> - As noted above, using lo0 for multi-server runs is a bad idea; it won't 
> work (OMPI may get confused and think it can use 127.0.0.0/8 to contact 
> multiple servers, because, going by the netmask, it hypothetically can).  But 
> you can do it for runs limited to your local laptop with no problem.
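> 
> For a genuine multi-server run you'd instead include a real external 
> interface.  A sketch of what that could look like (the interface and host 
> names here are placeholders for whatever your nodes actually have):
> 
>    mpirun --mca oob_tcp_if_include en0 --host node1,node2 -np 2 ./hello_cxx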
> 
> - The real solution, as Ralph implied, is to stop using external IP interfaces 
> for single-server control messages (we talked about this off-list).  Let me 
> explain this statement a bit...  OMPI has 2 main channels for communication: 
> a) control messages and b) MPI traffic.  MPI traffic is already smart enough 
> to use shared memory for single-server MPI traffic and some form of network 
> for off-server MPI traffic.  The control message plane doesn't currently make 
> that distinction -- it uses IP interfaces for *all* traffic (and defaults to 
> not using loopback interfaces), regardless of destination.  So the real 
> solution is to make the control message plane a little smarter: put a named 
> unix domain socket in the filesystem on the local server and let local 
> control messages use that (instead of external IP addresses).  FWIW, this is 
> what LAM/MPI used to do; we just never adopted that into Open MPI (LAM/MPI 
> was one of Open MPI's predecessors).
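> 
> For the curious, the node-local piece is just standard POSIX: bind a named 
> unix domain socket at a filesystem path and let local peers connect() to 
> that path, so the traffic never touches an IP interface (and hence never 
> hits the firewall).  A minimal C sketch of the general technique -- not 
> OMPI's actual code, and the socket path is made up:
> 
>    /* create a named unix domain socket for node-local control traffic */
>    #include <stdio.h>
>    #include <string.h>
>    #include <sys/socket.h>
>    #include <sys/un.h>
>    #include <unistd.h>
> 
>    int main(void)
>    {
>        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
>        if (fd < 0) { perror("socket"); return 1; }
> 
>        struct sockaddr_un addr;
>        memset(&addr, 0, sizeof(addr));
>        addr.sun_family = AF_UNIX;
>        /* hypothetical path; a real MPI would use its own session dir */
>        strncpy(addr.sun_path, "/tmp/ompi-local-oob.sock",
>                sizeof(addr.sun_path) - 1);
> 
>        unlink(addr.sun_path);  /* remove any stale socket file */
>        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
>            perror("bind");
>            return 1;
>        }
>        /* local processes connect() to the same path; no IP stack,
>           so no firewall rule can block it */
>        listen(fd, 8);
>        close(fd);
>        return 0;
>    }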
> 
> This feature may take a little time to implement, and may or may not make it 
> into the v1.7.x series.  But you should be able to use the oob_tcp_if_include 
> MCA param in the meantime (see the FAQ for different ways to set MCA params; 
> you can stick it in an environment variable or file instead of manually 
> including it on the mpirun command line all the time, if that's more 
> convenient).
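> 
> For example (assuming a bash-like shell; the conf file below is the standard 
> per-user MCA params file):
> 
>    # environment variable form
>    export OMPI_MCA_oob_tcp_if_include=lo0
>    mpirun -np 2 ./hello_cxx
> 
>    # or, equivalently, put this line in $HOME/.openmpi/mca-params.conf
>    oob_tcp_if_include = lo0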
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
