On Dec 20, 2006, at 6:28 PM, Michael John Hanby wrote:

Howdy, I'm new to cluster administration, MPI and high speed networks.

I've compiled my OpenMPI using these settings:

./configure CC='icc' CXX='icpc' FC='ifort' F77='ifort'
--with-mvapi=/usr/local/topspin
--with-mvapi-libdir=/usr/local/topspin/lib64 --enable-static
--prefix=/share/apps/openmpi/1.1.2

Looks good.

I'm running a Gromacs -np 16 job that I submitted using Sun Grid Engine
and OpenMPI; it's going to run for several hours.

The job was submitted with:
mpirun -np 16 -machinefile machines mdrun ......

I've been asked by the owner of the cluster "How can you prove to me
that this openmpi job is using the Infiniband network?"

At first I thought a simple netstat -an on the compute nodes might tell me; however, I don't see the InfiniBand IPs in the list, so I'm thinking maybe I need to be looking elsewhere.

That's correct.

ompi_info reports:
  MCA mpool: mvapi (MCA v1.0, API v1.0, Component v1.1.2)
  MCA btl: mvapi (MCA v1.0, API v1.0, Component v1.1.2)

Good -- this shows that OMPI was properly compiled with MVAPI support (have you considered moving to the OpenFabrics/OFED IB stack, perchance? See this web page for more details: http://www.open-mpi.org/faq/?category=openfabrics, in particular, http://www.open-mpi.org/faq/?category=openfabrics#vapi-support).
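If you want to double-check what your build provides, ompi_info can also list the mvapi BTL's run-time parameters; a quick sketch (the --param syntax here is the 1.x-era ompi_info usage):

  shell$ ompi_info | grep btl
  shell$ ompi_info --param btl mvapi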

The usual answer is that you should be able to tell via performance that OMPI is using IB. You can do some small runs with an MPI network benchmark application (e.g., NetPIPE) to verify this. Runs over the IB network will exhibit much better performance than over the TCP network.
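For example, a minimal sketch, assuming you have built NetPIPE's MPI interface (which typically produces a binary named NPmpi -- adjust the name/path to whatever your build produced):

  shell$ mpirun -np 2 -machinefile machines ./NPmpi

An IB run should show dramatically lower latency and higher bandwidth than the same run forced over TCP (see below).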

In general, Open MPI checks out what networks are available at run time and chooses the "best" one(s) to use for MPI traffic. So if it sees an IB network, it should automatically use it. Additionally, if OMPI has support for the IB network compiled in (e.g., the mvapi components) and it *doesn't* find a valid IB network to use at run time, the mvapi component will complain and tell you that you're likely to get less performance than you expect and then fail over to tcp.

So -- no news is good news.  :-)

Admittedly, some users have argued that it would be nicer if mpirun had an option to show which networks are being used, such as:

  shell$ mpirun --show-me-which-network-i'm-using ...

But we haven't gotten around to implementing that because the users that we [informally] polled said "if you tell me that I'm *not* using the high speed network, that's good enough." So that's what we implemented first and never got a round tuit to implement the notices viewed from the other way around (so to speak). It's on the to-do list; it's just low priority. Patches would be gratefully accepted. ;-)

All that being said, you can *force* using the IB network with the following (also see http://www.open-mpi.org/faq/?category=openfabrics#ib-btl):

  shell$ mpirun --mca btl mvapi,self,sm ...

This tells Open MPI's "BTL" framework (read: MPI point-to-point network layer) to use only the mvapi, sm (shared memory), and self components ("self" is what is used when an MPI process sends to itself). Similarly, you can force using TCP with:

  shell$ mpirun --mca btl tcp,self,sm ...

If you specify the "btl" MCA parameter, OMPI will only use the components that you tell it to use. Or, you can tell OMPI which components *not* to use, such as:

  shell$ mpirun --mca btl ^tcp ...

This tells OMPI that it can use any BTL component *except* tcp (i.e., mvapi, sm, self, etc.).
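So, applied to the job you described, a sketch would look like this (the trailing mdrun arguments are whatever you were already passing):

  shell$ mpirun --mca btl mvapi,self,sm -np 16 -machinefile machines mdrun ...

Note, too, that any MCA parameter can be set via the environment instead of the mpirun command line (handy when the mpirun invocation is buried inside an SGE job script), e.g., in a Bourne-style shell:

  shell$ export OMPI_MCA_btl=mvapi,self,sm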

Hope this helps!

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
