I'm part of a team that maintains a global climate model running under
MPI. Recently we have been trying it out with different MPI stacks at
high resolutions and processor counts.

At one point in the code there is a large number of MPI_Isends/MPI_Recvs
(tens to hundreds of thousands) where data distributed across all MPI
processes must be gathered onto a particular processor (or set of
processors) and transformed to a new resolution before writing.
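
To give a picture of that section, here is a stripped-down sketch of the
pattern in C (illustrative only; names like gather_to_writer, field and
nlocal are made up, and the real code is considerably more involved):

#include <mpi.h>
#include <stdlib.h>

/* Every rank posts a non-blocking send of its local slab of a field to the
   writing rank; the writing rank receives the slabs one by one.  At high
   core counts this is tens to hundreds of thousands of point-to-point
   messages per write.  For simplicity this assumes every rank holds the
   same number of points (nlocal) and the writer's own slab is handled
   locally and omitted here. */
void gather_to_writer(double *field, int nlocal, int writer, MPI_Comm comm)
{
    int rank, nranks;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nranks);

    if (rank != writer) {
        MPI_Request req;
        MPI_Isend(field, nlocal, MPI_DOUBLE, writer, 0, comm, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else {
        double *recvbuf = malloc((size_t)nlocal * sizeof(double));
        for (int src = 0; src < nranks; ++src) {
            if (src == writer)
                continue;
            MPI_Recv(recvbuf, nlocal, MPI_DOUBLE, src, 0, comm,
                     MPI_STATUS_IGNORE);
            /* ... transform/regrid recvbuf to the output resolution ... */
        }
        free(recvbuf);
    }
}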

At first, when it hit that section of code, the model crashed with the
message:
"A process failed to create a queue pair. This usually means either the
device has run out of queue pairs (too many connections) or there are
insufficient resources available to allocate a queue pair (out of
memory). The latter can happen if either 1) insufficient memory is
available, or 2) no more physical memory can be registered with the device."
Watching the node memory in an xterm, I could see it skyrocket and fill
the node.

Somewhere we found a suggestion to try the XRC queues
(http://www.open-mpi.org/faq/?category=openfabrics#ib-xrc) to get around
this problem, and indeed running with

setenv OMPI_MCA_btl_openib_receive_queues \
  "X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32"
mpirun --bind-to-core -np numproc ./app

allowed the model to run successfully.
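
For reference, since OMPI_MCA_* environment variables and mpirun's --mca
option are interchangeable in Open MPI, the same setting can also go
directly on the command line (sometimes handier in a batch script):

mpirun --mca btl_openib_receive_queues \
  "X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32" \
  --bind-to-core -np numproc ./app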

The model still seems to use a large amount of memory when it writes (on
the order of several GB). Does anyone have any suggestions on how to
tweak these settings to help with memory use?
--
Ben Auer, PhD SSAI, Scientific Programmer/Analyst
NASA GSFC, Global Modeling and Assimilation Office
Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
Phone: 301-286-9176 Fax: 301-614-6246