We have a large ensemble-based atmospheric data assimilation system that
does a 3-D Cartesian partitioning of the 'domain' using MPI_DIMS_CREATE,
MPI_CART_CREATE, etc.  Two of the dimensions are spatial, i.e. latitude
and longitude; the third is an 'ensemble' dimension, across which
subsets of the ensemble members are held.

Most MPI communication is in the two spatial dimensions, while
calculations in the ensemble dimension are essentially embarrassingly
parallel, except for an occasional collective reduction call.  On a
typical system with multi-core nodes, we would like to map the spatial
dimensions on-node, to get the benefit of shared-memory communication,
and map the ensemble dimension across nodes, since there isn't much
communication required.  For example, for a 90-member ensemble case
running on six 16-core nodes (96 cores), we might do a 4x4 mapping in
the spatial dimensions, and have 6 as the ensemble dimension (15
members each).
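
For concreteness, a minimal C sketch of the kind of setup we have in
mind might look like the following (the hardcoded 6x4x4 sizes, the
communicator names, and the trivial reduction are purely illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    int dims[3]     = {6, 4, 4};  /* {ensemble, lat, lon}; 96 ranks assumed */
    int periods[3]  = {0, 0, 0};  /* non-periodic here, just for the sketch */
    int keep_ens[3] = {1, 0, 0};  /* keep only the ensemble dimension       */
    int coords[3];
    MPI_Comm cart, ens_line;
    double local = 1.0, esum;

    MPI_Init(&argc, &argv);

    /* reorder = 0: ranks keep their MPI_COMM_WORLD order.  With the
       ensemble dimension listed first it varies slowest in the
       equivalent 1-D mapping, and the spatial dimensions vary fastest. */
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 0, &cart);
    MPI_Comm_rank(cart, &rank);
    MPI_Cart_coords(cart, rank, 3, coords);

    /* Sub-communicator spanning only the ensemble dimension, for the
       occasional collective reduction across ensemble subsets.        */
    MPI_Cart_sub(cart, keep_ens, &ens_line);
    MPI_Allreduce(&local, &esum, 1, MPI_DOUBLE, MPI_SUM, ens_line);

    printf("rank %3d -> (ens=%d, lat=%d, lon=%d), ensemble sum = %g\n",
           rank, coords[0], coords[1], coords[2], esum);

    MPI_Comm_free(&ens_line);
    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
}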

My question is this:  If the Cartesian mapping is done so the two
spatial dimensions are the 'most rapidly varying' in the equivalent 1-D
processor mapping, will Open MPI automatically assign those two
dimensions 'on-node', and assign the 'ensemble' dimension as the
slowest varying and across nodes?  If not, how can we guarantee that
this happens?

T. Rosmond




