On Jun 28, 2012, at 8:04 PM, Yong Qin wrote:

> Thanks to Jeff, we now have a bug registered with the segv issue.

There may be some confusion here about the fact that OMPI supports two different 
MX transports: an MTL and a BTL.  Here's what the README says:

-----
- Myrinet MX (and Open-MX) support is shared between the 2 internal
  devices, the MTL and the BTL.  The design of the BTL interface in
  Open MPI assumes that only naive one-sided communication
  capabilities are provided by the low level communication layers.
  However, modern communication layers such as Myrinet MX, InfiniPath
  PSM, or Portals, natively implement highly-optimized two-sided
  communication semantics.  To leverage these capabilities, Open MPI
  provides the "cm" PML and corresponding MTL components to transfer
  messages rather than bytes.  The MTL interface implements a shorter
  code path and lets the low-level network library decide which
  protocol to use (depending on issues such as message length,
  internal resources and other parameters specific to the underlying
  interconnect).  However, Open MPI cannot currently use multiple MTL
  modules at once.  In the case of the MX MTL, process loopback and
  on-node shared memory communications are provided by the MX library.
  Moreover, the current MX MTL does not support message pipelining
  resulting in lower performances in case of non-contiguous
  data-types.

  The "ob1" and "csum" PMLs and BTL components use Open MPI's internal
  on-node shared memory and process loopback devices for high
  performance.  The BTL interface allows multiple devices to be used
  simultaneously.  For the MX BTL it is recommended that the first
  segment (which is as a threshold between the eager and the
  rendezvous protocol) should always be at most 4KB, but there is no
  further restriction on the size of subsequent fragments.

  The MX MTL is recommended in the common case for best performance on
  10G hardware when most of the data transfers cover contiguous memory
  layouts.  The MX BTL is recommended in all other cases, such as when
  using multiple interconnects at the same time (including TCP), or
  transferring non contiguous data-types.
-----
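
As a rough sketch of what the README text means in practice, the two MX
transports can be requested explicitly through MCA parameters on the mpirun
command line.  The helper name mx_mpirun_cmd and the application name
my_mpi_app are hypothetical, made up for this illustration; the "sm" and
"self" BTL names are assumed to match the shared memory and loopback
components in your Open MPI version.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the mpirun prefix
# that requests one of the two MX transports described in the README.
mx_mpirun_cmd() {
    case "$1" in
        # MTL path: the "cm" PML drives the MX MTL.
        mtl) echo "mpirun --mca pml cm --mca mtl mx" ;;
        # BTL path: the "ob1" PML drives the MX BTL, with Open MPI's own
        # shared memory ("sm") and loopback ("self") BTLs alongside it.
        btl) echo "mpirun --mca pml ob1 --mca btl mx,sm,self" ;;
        *)   echo "usage: mx_mpirun_cmd mtl|btl" >&2; return 1 ;;
    esac
}

# Example use (my_mpi_app is a placeholder for your own binary):
#   $(mx_mpirun_cmd mtl) -np 4 ./my_mpi_app
```

Naming the PML explicitly avoids relying on auto-selection, which is where
the MTL/BTL confusion tends to come from.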

If you want to use the MX MTL, the simplest approach may be to remove the MX BTL 
plugin from your installation directory.  That way, Open MPI *should* auto-select 
the MX MTL on machines that have MX, and on machines that do not have MX but do 
have OpenFabrics devices, it should auto-select the openib BTL.
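
A minimal sketch of the plugin removal, moving the files aside instead of
deleting them so the change is easy to undo.  It assumes the MX BTL plugin
files are named mca_btl_mx.* and live in the plugin directory of your
installation (typically <prefix>/lib/openmpi); check your own install before
relying on that.  The function name disable_mx_btl and both directory
arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): move the MX BTL plugin files
# out of the plugin directory so the component can no longer be loaded.
# Assumption: the files are named mca_btl_mx.* (e.g. .so and .la).
#   $1 = Open MPI plugin directory, $2 = backup directory outside of it
disable_mx_btl() {
    plugin_dir="$1"
    backup_dir="$2"
    mkdir -p "$backup_dir" || return 1
    for f in "$plugin_dir"/mca_btl_mx.*; do
        [ -e "$f" ] || continue        # glob matched nothing; skip
        mv "$f" "$backup_dir"/ || return 1
    done
}

# Example use (paths are placeholders):
#   disable_mx_btl /opt/openmpi/lib/openmpi /opt/openmpi/disabled-plugins
```

Afterwards, running ompi_info and looking for "mx" in its output should show
whether an MX MTL component is still listed and the MX BTL is gone.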

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

