On Jun 28, 2012, at 8:04 PM, Yong Qin wrote:

> Thanks to Jeff, we now have a bug registered with the segv issue.

There may be some confusion here because OMPI supports 2 different MX transports: an MTL and a BTL. Here's what the README says:

-----
- Myrinet MX (and Open-MX) support is shared between the 2 internal
  devices, the MTL and the BTL. The design of the BTL interface in
  Open MPI assumes that only naive one-sided communication
  capabilities are provided by the low level communication layers.
  However, modern communication layers such as Myrinet MX, InfiniPath
  PSM, or Portals, natively implement highly-optimized two-sided
  communication semantics. To leverage these capabilities, Open MPI
  provides the "cm" PML and corresponding MTL components to transfer
  messages rather than bytes. The MTL interface implements a shorter
  code path and lets the low-level network library decide which
  protocol to use (depending on issues such as message length,
  internal resources and other parameters specific to the underlying
  interconnect). However, Open MPI cannot currently use multiple MTL
  modules at once. In the case of the MX MTL, process loopback and
  on-node shared memory communications are provided by the MX library.
  Moreover, the current MX MTL does not support message pipelining
  resulting in lower performances in case of non-contiguous
  data-types.

  The "ob1" and "csum" PMLs and BTL components use Open MPI's internal
  on-node shared memory and process loopback devices for high
  performance. The BTL interface allows multiple devices to be used
  simultaneously. For the MX BTL it is recommended that the first
  segment (which is as a threshold between the eager and the
  rendezvous protocol) should always be at most 4KB, but there is no
  further restriction on the size of subsequent fragments.

  The MX MTL is recommended in the common case for best performance on
  10G hardware when most of the data transfers cover contiguous memory
  layouts. The MX BTL is recommended in all other cases, such as when
  using multiple interconnects at the same time (including TCP), or
  transferring non contiguous data-types.
-----

If you want to use the MX MTL, it may be simplest to remove the MX BTL plugin from your installation directory. That way, it *should* auto-select the MX MTL when you're on machines with MX, and when you're on machines that do not have MX but do have OpenFabrics devices, it should auto-select the openib BTL.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
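
P.S. If removing the plugin is not an option, the same choice can usually be made explicitly per run through MCA parameters. The following is only a rough sketch of the README's rule of thumb, not anything that ships with Open MPI: the helper name `mx_mca_args`, the `-np 4` count, and `./a.out` are made up for illustration, and the BTL list `mx,sm,self` is an assumption about a typical setup (append `tcp` or others if you really do mix interconnects).

```python
# Hypothetical helper (not part of Open MPI): turns the README's rule of
# thumb into explicit "--mca" arguments for mpirun.

def mx_mca_args(multiple_interconnects, noncontiguous_data):
    """Return mpirun arguments that select either the MX BTL or the MX MTL."""
    if multiple_interconnects or noncontiguous_data:
        # BTL path: the "ob1" PML drives the BTLs; "sm" and "self" are
        # Open MPI's own shared-memory and process-loopback devices.
        return ["--mca", "pml", "ob1", "--mca", "btl", "mx,sm,self"]
    # MTL path: the "cm" PML hands whole messages to the MX library,
    # which also handles loopback and on-node shared memory itself.
    return ["--mca", "pml", "cm", "--mca", "mtl", "mx"]


if __name__ == "__main__":
    # "-np 4" and "./a.out" are placeholders for a real job.
    for multi, noncontig in [(False, False), (True, False), (False, True)]:
        cmd = ["mpirun", "-np", "4"] + mx_mca_args(multi, noncontig) + ["./a.out"]
        print(" ".join(cmd))
```

Forcing the PML this way makes a run fail loudly if the requested component is unavailable, rather than silently falling back to something else, which may or may not be what you want on a mixed MX/OpenFabrics cluster.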