On Jun 24, 2006, at 1:19 PM, George Bosilca wrote:

Since your cluster has several network devices that are supported by
Open MPI, it is possible that the configure script detected the
correct paths to their libraries. Therefore, they might be included/
compiled into Open MPI by default. The simplest way to check is to use
the ompi_info tool: "ompi_info | grep btl" will list all the network
devices supported by your particular build.

If several devices (called BTLs in Open MPI terms) are compiled in,
forcing one eth interface for the TCP BTL alone is not enough. You
should specify that you want only the TCP BTL to be used, forcing
Open MPI to unload/ignore all other available BTLs. Add "--mca btl
tcp,self" to your mpirun command and the problem should be solved.
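A full launch line might look like the sketch below. The interface name (eth0), process count, and program name are assumptions for illustration; btl_tcp_if_include is the MCA parameter that restricts the TCP BTL to specific interfaces:

```shell
# Restrict Open MPI to the TCP and self BTLs only, and (optionally)
# pin TCP traffic to one interface -- eth0 here is an assumption.
mpirun --mca btl tcp,self \
       --mca btl_tcp_if_include eth0 \
       -np 4 ./my_mpi_program
```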

I've looked through the documentation, but I haven't found a discussion of what each BTL device is. For example, I have:

MCA btl: self (MCA v1.0, API v1.0, Component v1.2)
MCA btl: sm (MCA v1.0, API v1.0, Component v1.2)
MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)

I found a PDF presentation that describes a few:

• tcp - TCP/IP
• openib - InfiniBand OpenIB stack
• gm/mx - Myrinet GM/MX
• mvapi - InfiniBand Mellanox Verbs
• sm - shared memory

Are there any others I may see when interacting with other people's computers?

I assume that if a machine has Myrinet and I don't see "MCA btl: gm" or "MCA btl: mx", then I have to explain the problem to the sysadmins.

The second question is: should I see both gm and mx, or only one or the other?

Michael
