On Apr 4, 2008, at 2:47 PM, Matt Hughes wrote:
I was able to eliminate the hang I was seeing with 1.2.5 during the
gather operation by using these BTL parameters (found at
http://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mca/btl/openib/btl-openib-benchmark):
btl_openib_max_btls=20
btl_openib_rd_num=128
btl_openib_rd_low=75
btl_openib_rd_win=50
btl_openib_max_eager_rdma=32
mpool_base_use_mem_hooks=1
mpi_leave_pinned=1
Only the btl_openib_rd_low=75 and btl_openib_rd_num=128 parameters are
necessary to avoid the hang.
The information given for the parameters in ompi_info is not very
helpful. Can anyone explain (or point me to a reference) what these
parameters do and how they affect collective operations?
Yes (btl_openib_ prefix omitted for brevity):
max_btls: The maximum number of active IB ports that Open MPI will use
in each MPI process.
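As an aside, ompi_info can at least dump all of the openib BTL
parameters with their help strings, and any of them can be overridden
on the mpirun command line.  A rough sketch (my_mpi_app and the
process count are just placeholders):

  # List the openib BTL parameters, their descriptions, and default values
  ompi_info --param btl openib

  # Override a parameter for a single run, e.g., restrict each process
  # to one active IB port
  mpirun --mca btl_openib_max_btls 1 -np 4 ./my_mpi_app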
rd_num: Number of per-peer receive buffers posted when a connection is
made between two MPI processes. I.e., the first time you MPI_SEND/
MPI_RECV between a pair of MPI peers, rd_num buffers are posted for
incoming messages. More on this below.
rd_low: When the number of available receive buffers on a per-peer queue
pair drops to this number (the low watermark), Open MPI posts more.
rd_win: When the number of available receive buffers on a per-peer queue
pair drops to this number, Open MPI sends a flow control message to the peer.
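For example, the receive buffer settings from your report could be
passed on the mpirun command line; a sketch (the process count and
executable name are placeholders):

  # Post 128 buffers per peer, repost when only 75 remain,
  # and send flow control updates based on a window of 50
  mpirun --mca btl_openib_rd_num 128 \
         --mca btl_openib_rd_low 75 \
         --mca btl_openib_rd_win 50 \
         -np 16 ./my_mpi_app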
max_eager_rdma: How many buffers to post for "eager" RDMA short
messages between explicit pairs of MPI processes. Note that eager
RDMA is only used between a fixed number of pairs of peers in order to
a) conserve registered memory and b) limit the number of memory
locations that must be polled to check for message passing progress.
Check out this [relatively new] FAQ entry for more details: http://www.open-mpi.org/faq/?category=openfabrics#ib-small-message-rdma
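The same kind of run-time override applies to the eager RDMA setting
from the benchmark file; a sketch (placeholders as before, and the
exact parameter names can always be double-checked with ompi_info):

  # Allow up to 32 eager RDMA connections per process; there is also an
  # on/off switch for eager RDMA (btl_openib_use_eager_rdma, if memory serves)
  mpirun --mca btl_openib_max_eager_rdma 32 -np 16 ./my_mpi_app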
mpool_base_use_mem_hooks: If Open MPI was compiled with support for
memory hooks (which is usually the default), this allows the use of
the mpi_leave_pinned parameter.
mpi_leave_pinned: The simple description of this parameter is that if
your application repeatedly sends and receives from the same buffers,
enabling mpi_leave_pinned will likely result in a performance boost.
Check out these [relatively new] FAQ entries for more details:
http://www.open-mpi.org/faq/?category=openfabrics#large-message-tuning-1.2
and http://www.open-mpi.org/faq/?category=openfabrics#large-message-leave-pinned
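If you settle on values that you like, you don't have to retype them
for every run; they can be made persistent.  A sketch, assuming the
standard per-user MCA parameter file location:

  # $HOME/.openmpi/mca-params.conf -- picked up by every Open MPI job you launch
  mpi_leave_pinned = 1
  mpool_base_use_mem_hooks = 1
  btl_openib_rd_num = 128
  btl_openib_rd_low = 75

Any MCA parameter can also be set as an environment variable by
prefixing its name with OMPI_MCA_, e.g., export OMPI_MCA_mpi_leave_pinned=1.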
Note that long message tuning parameters are changing slightly in the
upcoming v1.3 series. Check out this FAQ entry:
http://www.open-mpi.org/faq/?category=openfabrics#large-message-tuning-1.3
Does this help? Sorry it took so long to answer your questions;
please feel free to ask more.
--
Jeff Squyres
Cisco Systems