On Dec 20, 2006, at 6:28 PM, Michael John Hanby wrote:
Howdy, I'm new to cluster administration, MPI and high speed networks.
I've compiled my OpenMPI using these settings:
./configure CC='icc' CXX='icpc' FC='ifort' F77='ifort' \
    --with-mvapi=/usr/local/topspin \
    --with-mvapi-libdir=/usr/local/topspin/lib64 --enable-static \
    --prefix=/share/apps/openmpi/1.1.2
Looks good.
I'm running a Gromacs -np 16 job that I submitted using Sun Grid Engine
and Open MPI; it's going to run for several hours.
The job was submitted with:
mpirun -np 16 -machinefile machines mdrun ......
I've been asked by the owner of the cluster "How can you prove to me
that this openmpi job is using the Infiniband network?"
At first I thought a simple netstat -an on the compute nodes might tell
me; however, I don't see the InfiniBand IPs in the list, so I'm thinking
maybe I need to be looking elsewhere.
That's correct.
ompi_info reports:
MCA mpool: mvapi (MCA v1.0, API v1.0, Component v1.1.2)
MCA btl: mvapi (MCA v1.0, API v1.0, Component v1.1.2)
Good -- this shows that OMPI was properly compiled with MVAPI support
(have you considered moving to the OpenFabrics/OFED IB stack,
perchance? See this web page for more details:
http://www.open-mpi.org/faq/?category=openfabrics, in particular,
http://www.open-mpi.org/faq/?category=openfabrics#vapi-support).
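As a related sanity check, you can list the components that were
compiled into your build straight from ompi_info; something like the
following (output will obviously vary with your install) should show
the mvapi btl:

shell$ ompi_info | grep btl

Keep in mind that this only shows what was compiled in, not which
network actually gets selected at run time.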
The usual answer is that you should be able to tell via performance
that OMPI is using IB. You can do some small runs with an MPI
network benchmark application (e.g., NetPIPE) to verify this. Runs
over the IB network will exhibit much better performance than over
the TCP network.
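For example -- hedging a bit on names here: the MPI flavor of NetPIPE
usually builds a binary called NPmpi, and "two_nodes" below is a
hypothetical machinefile listing two *different* hosts so that the
shared memory component can't be used instead -- a back-to-back
comparison using the btl-forcing syntax described further below would
look something like:

shell$ mpirun -np 2 -machinefile two_nodes --mca btl mvapi,self NPmpi
shell$ mpirun -np 2 -machinefile two_nodes --mca btl tcp,self NPmpi

(sm is deliberately left out so the comparison really is IB vs. TCP.)
The first run should show noticeably lower latency and much higher
bandwidth than the second; if the two look the same, you're not on IB.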
In general, Open MPI checks out what networks are available at run
time and chooses the "best" one(s) to use for MPI traffic. So if it
sees an IB network, it should automatically use it. Additionally, if
OMPI has support for the IB network compiled in (e.g., the mvapi
components) and it *doesn't* find a valid IB network to use at run
time, the mvapi component will complain and tell you that you're
likely to get less performance than you expect and then fail over to
tcp.
So -- no news is good news. :-)
Admittedly, some users have advocated that it would be nicer to have
an mpirun option showing which networks are being used, such as:
shell$ mpirun --show-me-which-network-i'm-using ...
But we haven't gotten around to implementing that because the users
that we [informally] polled said "if you tell me that I'm *not* using
the high speed network, that's good enough." So that's what we
implemented first and never got a round tuit to implement the notices
viewed from the other way around (so to speak). It's on the to-do
list; it's just low priority. Patches would be gratefully
accepted. ;-)
All that being said, you can *force* using the IB network with the
following (also see
http://www.open-mpi.org/faq/?category=openfabrics#ib-btl):
shell$ mpirun --mca btl mvapi,self,sm ...
This tells Open MPI to use only the listed "BTL" (read: MPI
point-to-point network) components: mvapi, sm (shared memory), and
self ("self" is what is used when an MPI process sends to itself).
Similarly, you can force using TCP with:
shell$ mpirun --mca btl tcp,self,sm ...
If you specify the "btl" MCA parameter, OMPI will only use the
components that you tell it to use. Or, you can tell OMPI which
components *not* to use, such as:
shell$ mpirun --mca btl ^tcp ...
This tells OMPI that it can use any btl component *except* tcp
(i.e., mvapi, sm, self, etc.).
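One last note: MCA parameters don't have to be given on the mpirun
command line. If it's more convenient inside your SGE job script, you
can also set them via environment variables of the form
OMPI_MCA_<param>, or put them in $HOME/.openmpi/mca-params.conf.
For example:

shell$ export OMPI_MCA_btl=mvapi,self,sm
shell$ mpirun -np 16 -machinefile machines mdrun ......

(the mca-params.conf equivalent would be a line reading
"btl = mvapi,self,sm").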
Hope this helps!
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems