On Nov 25, 2008, at 10:29 AM, Максим Чусовлянов wrote:

Hello! How can I integrate my collective communication algorithm into Open MPI via the MCA?

Sorry for the delay in answering -- SC08 and the US holiday last week got in the way and I'm way behind on answering the mails in my INBOX.

Just to make sure we're talking about the same thing -- you have a new collective algorithm for one of the MPI collective functions, and you want to include that code in Open MPI so that it can be invoked by MPI_<foo> in MPI applications, right?

If so, the right way to do this is to build a new Open MPI "coll" (collective) component containing the code for your new algorithm. Our coll components are basically a few housekeeping functions plus a set of function pointers to the routines that serve as the back-ends to the MPI collective functions (i.e., MPI_Bcast and friends).

All the "coll" component code is under the ompi/mca/coll/ directory. The "base" directory is some "glue" code for the coll framework itself -- it's not a component. But all other directories are standalone components that have corresponding dynamic shared objects (DSOs) installed under $pkglibdir (typically $prefix/lib/openmpi).

You can build a component inside or outside of the Open MPI tree. If you build outside of the Open MPI tree, you need to configure OMPI with --with-devel-headers, which will install all of OMPI's internal headers under $prefix. That way, you can -I these headers when you compile your component. Just install your DSO in $pkglibdir; if all goes well, "ompi_info | grep coll" should show your component.

If you build inside of the Open MPI tree, you need to make your component dir under ompi/mca/coll/ and include a configure.params file (look at ompi/mca/coll/basic/configure.params for a simple example) and a Makefile.am (see ompi/mca/coll/basic/Makefile.am for an example). Then run the "autogen.sh" script that is at the top of the tree and then run configure. You should see your component listed in both the autogen.sh and configure output; configure should note that it plans to build that component. When you finish configure, build and install Open MPI. "ompi_info | grep coll" should show your component.

But I'm getting ahead of myself...  Let's go back a few steps...

When building inside the OMPI tree, if you need to check for various things to determine if you can build the component (i.e., some tests during configure, such as checking for various hardware support libraries), you can also add a configure.m4 file in your component's directory. This gets a little tricky if you're not familiar with Autoconf; let me know if you need some guidance here.
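As a concrete illustration, here is a hypothetical configure.m4 sketch for a component named "mycoll" that depends on an imaginary support library "libfoo". The MCA_<framework>_<component>_CONFIG macro name follows OMPI's convention, but check an existing component's configure.m4 in your tree for the exact macro signature and argument meanings before copying this:

```m4
# Hypothetical sketch -- "mycoll" and "libfoo" are made-up names.
# $1 = action if the component can be built; $2 = action if it cannot.
AC_DEFUN([MCA_coll_mycoll_CONFIG],[
    AC_CHECK_LIB([foo], [foo_init],
                 [$1],
                 [$2])
])
```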

Now you can add the source code to the component. We have 2 important abstractions that you need to know about:

- component: there is only one component instance in an MPI process. It has global state.
- module: in the coll framework, there is one module instance for every communicator that uses this component. It has local state relevant to that specific communicator.

Think of "component" as a C++ class, and "module" as a C++ object.

Now read the comments in ompi/mca/coll/coll.h. This file contains the struct interfaces for both the coll component and module. We basically do everything by function pointer; the component returns a set of function pointers and each module returns a struct of function pointers. These function pointers are invoked by libmpi at various times for various functions; see coll.h for a description of each.
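The component/module split boils down to the classic function-pointer-table pattern. Here's a self-contained C sketch of that pattern -- illustrative only; the real struct layouts, type names, and function signatures are in ompi/mca/coll/coll.h, so don't copy these names:

```c
#include <stdlib.h>

/* "module": per-communicator object; a table of collective entry points */
typedef struct module {
    int (*bcast)(void *buf, int count);  /* back-end for MPI_Bcast */
    int (*barrier)(void);                /* back-end for MPI_Barrier */
} module_t;

/* "component": one instance per process; manufactures a module
   for each communicator that selects it */
typedef struct component {
    const char *name;
    module_t *(*comm_query)(int comm_size);
} component_t;

static int my_bcast(void *buf, int count) { (void)buf; (void)count; return 0; }
static int my_barrier(void) { return 0; }

/* Called at communicator creation; returns the function-pointer table
   that libmpi will invoke for collectives on that communicator */
module_t *my_comm_query(int comm_size) {
    (void)comm_size;
    module_t *m = malloc(sizeof(*m));
    m->bcast = my_bcast;
    m->barrier = my_barrier;
    return m;
}

component_t my_component = { "example", my_comm_query };
```

In the class/object analogy below: my_component is the "class" (one per process, global state) and each malloc'ed module_t is an "object" (one per communicator, local state).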

During coll module initialization (i.e., when a new communicator has been created), there's a process called "selection" where OMPI determines which coll modules will be used on this communicator. Modules can include/exclude themselves from the selection process. For example, your algorithm may only be suitable for intracommunicators; so if the communicator being created is an intercommunicator, you probably want to exclude your module from selection. Or if your algorithm can only handle a power-of-two number of MPI processes, it should exclude itself if there is a non-power-of-two number of processes in the communicator. And so on.
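The exclusion logic itself is usually just a few cheap checks at query time. Here's a standalone C sketch of the decision described above, using a made-up comm_info_t stand-in for the communicator (the real hook is the module query function declared in ompi/mca/coll/coll.h, which returns a module pointer or NULL):

```c
#include <stddef.h>

/* Stand-in for the communicator attributes the query would inspect */
typedef struct comm_info {
    int size;      /* number of processes in the communicator */
    int is_inter;  /* nonzero for an intercommunicator */
} comm_info_t;

static int is_power_of_two(int n) {
    return n > 0 && (n & (n - 1)) == 0;
}

/* Return nonzero to stay in the selection process, zero to exclude
   ourselves for this communicator */
int my_coll_wants_comm(const comm_info_t *comm) {
    if (comm->is_inter) {
        return 0;  /* our algorithm only handles intracommunicators */
    }
    if (!is_power_of_two(comm->size)) {
        return 0;  /* our algorithm needs a power-of-two process count */
    }
    return 1;
}
```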

We designed coll modules in OMPI v1.3 to be "mix-n-match"-able such that in a single communicator, you can use the broadcast function from one module, but the gather function from a different module. Hence, multiple coll modules may be active on a single communicator. In your case, you'll need to make sure that your function has a higher priority than the "tuned" coll component (which is the default in many cases).
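To make the mix-n-match idea concrete, here's a self-contained C sketch of priority-driven per-function selection: for each collective slot, the highest-priority candidate that offers that function wins. This is not OMPI's actual selection code, and the names and priority values are invented (in real OMPI the priority typically comes from an MCA parameter such as coll_<name>_priority):

```c
#include <stddef.h>

/* A candidate coll module with its priority; a NULL function pointer
   means "this module does not offer that collective" */
typedef struct candidate {
    const char *name;
    int priority;
    int (*bcast)(void);
    int (*gather)(void);
} candidate_t;

/* Highest-priority candidate that provides bcast wins that slot */
const candidate_t *select_bcast(const candidate_t *c, int n) {
    const candidate_t *best = NULL;
    for (int i = 0; i < n; i++)
        if (c[i].bcast && (best == NULL || c[i].priority > best->priority))
            best = &c[i];
    return best;
}

/* Same rule, independently, for the gather slot */
const candidate_t *select_gather(const candidate_t *c, int n) {
    const candidate_t *best = NULL;
    for (int i = 0; i < n; i++)
        if (c[i].gather && (best == NULL || c[i].priority > best->priority))
            best = &c[i];
    return best;
}

int dummy_bcast(void) { return 0; }
int dummy_gather(void) { return 0; }

/* Invented example: "tuned" offers both collectives at priority 30;
   "mycoll" offers only bcast, at a higher priority of 80 */
candidate_t example_modules[2] = {
    { "tuned",  30, dummy_bcast, dummy_gather },
    { "mycoll", 80, dummy_bcast, NULL },
};
```

With this setup, bcast on the communicator would come from "mycoll" while gather falls back to "tuned" -- which is exactly the per-function mixing described above.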

I'd suggest working in the Open MPI v1.3 tree, as we're going to release this version soon and all future work is being done here (vs. the v1.2 tree, which will eventually be deprecated).

Hopefully this is enough information to get you going. Please feel free to ask more questions! But you might want to post followup questions to the devel list; these aren't really user-level questions.

Good luck!

--
Jeff Squyres
Cisco Systems

