On Jul 6, 2010, at 5:41 PM, Robert Walters wrote:

> Thanks for your expeditious responses, Ralph.
> 
> Just to confirm with you, I should change openmpi-mca-params.conf to include:
> 
> oob_tcp_port_min_v4 = (My minimum port in the range)
> oob_tcp_port_range_v4 = (My port range)
> btl_tcp_port_min_v4 = (My minimum port in the range)
> btl_tcp_port_range_v4 = (My port range)
> 
> correct?

That should do ya.  Use the same values on all nodes.  You can confirm that 
OMPI's run-time system is working by mpirun'ing a non-MPI program like 
"hostname" or some such.  If that works, then the daemons are launching, 
talking to each other, launching the app, shuttling the I/O around, noticing 
that the app has exited, tidying everything up, and telling mpirun that 
everything is done.  In short: a lot of things are working correctly if 
you're able to mpirun "hostname" across multiple hosts.
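
For concreteness, here's a minimal sketch of what that might look like -- the 
port numbers and hostnames below are just placeholders, so substitute whatever 
range your sysadmin actually opens for you:

    # openmpi-mca-params.conf (hypothetical example: 100 ports starting at 10000)
    oob_tcp_port_min_v4 = 10000
    oob_tcp_port_range_v4 = 100
    btl_tcp_port_min_v4 = 10000
    btl_tcp_port_range_v4 = 100

    # quick sanity check that the run-time plumbing works across hosts
    # (node1,node2 are placeholder hostnames):
    shell$ mpirun --host node1,node2 hostname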

> Also, for a cluster of around 32-64 processes (8 processors per node), how 
> wide of a range will I require? I've noticed some entries in the mailing list 
> suggesting you need a few to get started and then it opens as necessary. Will 
> I be safe with 20 or should I go for 100? 

If you have 64 hosts, each with 8 processors, meaning that the largest MPI job 
you would run would be 64 * 8 = 512 MPI processes, then I'd ask for at least 
1024 ports -- 2048 would be better (you have a zillion ports available; better 
to ask for more than you need).  We recently found a bug in the TCP BTL where 
it *may* use 2 sockets for each peer-to-peer connection in some cases.
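
If you want to experiment with the range size before committing it to 
openmpi-mca-params.conf, the same MCA parameters can also be passed on the 
mpirun command line.  A rough sketch, with a made-up starting port and 
application name:

    shell$ mpirun --mca oob_tcp_port_min_v4 10000 --mca oob_tcp_port_range_v4 2048 \
                  --mca btl_tcp_port_min_v4 10000 --mca btl_tcp_port_range_v4 2048 \
                  -np 512 ./my_mpi_app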

Additionally, your sysadmin *might* be more amenable to opening up ports *only 
between the cluster nodes* (vs. opening up the ports to anything).  If that's 
the case, you might as well go for the gold and ask them if they can open up 
*all* the ports between all your nodes (while still rejecting everything from 
non-cluster nodes).
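
Just as a rough illustration of what "open everything between the nodes" might 
look like *if* your sysadmin happens to use iptables -- the 10.0.0.0/24 subnet 
here is entirely hypothetical, and their actual tooling and policy may differ:

    # accept all TCP traffic from other cluster nodes (hypothetical subnet)
    shell$ iptables -A INPUT -p tcp -s 10.0.0.0/24 -j ACCEPT
    # non-cluster traffic keeps getting rejected by the existing default policy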

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

