The primary person you need to talk to is turning in her dissertation within the next few days. So I think she's kinda busy at the moment... :-)

Sorry for the delay -- I'll take a shot at answers below...


On Aug 14, 2007, at 4:39 PM, smai...@ksu.edu wrote:

Can anyone help on this?

-Thanks,
Sarang.

Quoting smai...@ksu.edu:

Hi,
I am doing research on parallel techniques for shared-memory
systems (NUMA). I understand that Open MPI is intelligent enough to utilize
shared-memory systems and that it uses processor affinity.

Open MPI has coarse-grained processor-affinity control, see:

http://www.open-mpi.org/faq/?category=tuning#using-paffinity

Expect to see more functionality / flexibility here in the future...
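For example (this is a sketch -- the exact parameter name is the one described in that FAQ entry for the 1.2 series, so check ompi_info on your version), you can ask Open MPI to bind each process to a processor with the mpi_paffinity_alone MCA parameter:

    shell$ mpirun --mca mpi_paffinity_alone 1 -np 4 ./my_mpi_app

(./my_mpi_app is just a placeholder for your own executable, of course.)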

Is the Open MPI design of MPI_AllReduce the same for shared-memory (NUMA) systems as for distributed systems? Can someone please describe the MPI_AllReduce design, in brief, in terms of the processes and their interaction on shared memory?

Open MPI is fundamentally based on plugins. We have plugins for various flavors of collective algorithms (see the code base: ompi/mca/coll/), one of which is "sm" (shared memory). The shared memory collectives are currently quite limited but are being expanded and improved by Indiana University (e.g., IIRC, allreduce uses the shared memory reduce followed by a shared memory bcast).
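If you want to see which coll components your build actually has, ompi_info will list them (component names may differ slightly between versions, but a typical build should show at least basic, self, sm, and tuned):

    shell$ ompi_info | grep coll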

The "tuned" collective plugin has its own implementation(s) of Allreduce -- Jelena or George will have to comment here. They do not assume shared memory; they use well-known algorithms for allreduce. The "tuned" component basically implements a wide variety of algorithms for each MPI collective and attempts to choose which one will be best to use at run-time. U. Tennessee has done a lot of work in this area and I think they have several published papers on it.

The "basic" plugin is the dirt-simple correct-but-not-optimized component that does simple linear and logarithmic algorithms for all the MPI collectives. If we don't have a usable algorithm anywhere else, we fall back to the basic plugin (e.g., allreduce is a reduce followed by a bcast).

Otherwise, please suggest a good reference for this.

Our basic philosophy / infrastructure for MPI collectives is based on this paper:

    http://www.open-mpi.org/papers/ics-2004/

That said, work that happened literally last week is just about to hit the development trunk (within a week or so -- still doing some debugging) that brings Goodness here: a first level of mixing-and-matching between collective components that do not provide all the MPI algorithms. I can explain more if you care.

Hope this helps...

--
Jeff Squyres
Cisco Systems
