Adding Craig Rasmussen from LANL into the CC list...

On Oct 31, 2006, at 10:26 AM, Michael Kluskens wrote:

OpenMPI tickets 39 & 55 deal with problems with the Fortran 90 large interface with regard to:

#39: MPI_IN_PLACE in MPI_REDUCE <https://svn.open-mpi.org/trac/ompi/ticket/39>
#55: MPI_GATHER with arrays of different dimensions <https://svn.open-mpi.org/trac/ompi/ticket/55>

Attached is a patch to deal with these two issues as applied against OpenMPI-1.3a1r12364.

Thanks for the patch! Before committing it, though, I think more needs to be done, and I want to understand it before doing so (part of this is me thinking it out while I write this e-mail...). Also, be aware that SC is 1.5 weeks away, so I may not be able to address this issue before then (SC tends to be all-consuming).

1. The "same type" heuristic for the "large" F90 module was not intended to cover all possible scenarios. You're absolutely right that assuming the same type makes no sense for some of the interfaces. The problem is that the obvious alternative (covering all possible scenarios) creates an exponential number of interfaces (in the millions). So "large" was an attempt to provide *some* of the interfaces -- but your experience has shown that this can do more harm than good (i.e., it makes some legal MPI applications uncompilable because we provide *some* interfaces to MPI_GATHER, but not all).
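To make this concrete, here's roughly the shape of what the "same type" heuristic generates (a hand-written sketch with made-up procedure names, not the actual script-generated code): one specific procedure per type/kind/rank combination, where sendbuf and recvbuf always share the same type and rank, all overloaded under one generic name.

  module mpi_large_sketch
    implicit none
    interface MPI_Gather
       ! Specific for INTEGER, rank-1 send / rank-1 recv; the real "large"
       ! module repeats this pattern for every type/kind/rank combination.
       subroutine MPI_Gather_int_1d(sendbuf, sendcount, sendtype, &
                                    recvbuf, recvcount, recvtype, &
                                    root, comm, ierr)
         integer, dimension(:), intent(in)    :: sendbuf
         integer, dimension(:), intent(inout) :: recvbuf
         integer, intent(in)  :: sendcount, sendtype, recvcount, recvtype
         integer, intent(in)  :: root, comm
         integer, intent(out) :: ierr
       end subroutine MPI_Gather_int_1d
       ! ...and so on: integer 2d/2d, real 1d/1d, complex 3d/3d, etc.
    end interface
  end module mpi_large_sketch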

1a. It gets worse because of MPI's semantics for MPI_GATHER. You pointed out one scenario -- it doesn't make sense to require the same type and dimension for both the sendbuf and recvbuf (e.g., a scalar "integer" for both), because the root will need an integer *array* to receive all the values (similar logic applies to MPI_SCATTER and other collectives -- so what you did for MPI_GATHER would need to be applied to several others as well).

1b. But even worse than that is the fact that, for MPI_GATHER, the receive buffer is not relevant on non-root processes. So it's valid for *any* type to be passed for non-root processes (leading to the exponential interface explosion described above).

So having *some* interfaces for MPI_GATHER can be a problem for both 1a and 1b -- perfectly valid/legal MPI apps will fail to compile.
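As a concrete (hypothetical) example, the following program is legal MPI, but a "large" module that only provides same-type/same-rank interfaces will refuse to compile both MPI_GATHER calls -- the first because sendbuf is a scalar and recvbuf is a rank-1 array (1a), the second because the non-root recvbuf isn't even an integer (1b):

  program gather_legal
    use mpi
    implicit none
    integer :: me, ierr
    integer :: myval         ! scalar sendbuf
    integer :: allvals(16)   ! rank-1 recvbuf; only significant at the root
    real    :: unused        ! non-roots may legally pass anything as recvbuf
    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, me, ierr)
    myval = me
    if (me == 0) then        ! assumes <= 16 processes for this sketch
       call MPI_GATHER(myval, 1, MPI_INTEGER, allvals, 1, MPI_INTEGER, &
                       0, MPI_COMM_WORLD, ierr)
    else
       call MPI_GATHER(myval, 1, MPI_INTEGER, unused, 1, MPI_INTEGER, &
                       0, MPI_COMM_WORLD, ierr)
    end if
    call MPI_FINALIZE(ierr)
  end program gather_legal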

I'm not sure what the right balance is here -- how do we allow for both 1a and 1b without creating millions of interfaces? Your patch created MPI_GATHER interfaces for all the same types, but allowing any dimension mix. With the default max dimension level of 4 in OMPI's interfaces, this created 90 new interfaces for MPI_GATHER, calculated (and verified with some grep/wc'ing):

For src buffer of dimension:    0   1   2   3   4
Create this many recvbuf types: 4 + 4 + 3 + 2 + 1 = 14

(The recvbuf is always an array -- rank 1 or higher -- whose rank is at least the sendbuf's rank, hence the counts above.)

For each src/recvbuf combination, create this many interfaces:

(char + logical + (integer * 4) + (real * 2) + (complex * 2)) = 10

Where 4, 2, and 2 are the number of integer, real, and complex types supported by the compiler on my machines (e.g., gfortran on OSX/intel and Linux/EM64T).

So this created 14 * 10 = 140 interfaces, as opposed to the 50 that were there before the patch (5 dimensions of src/recvbuf * 10 types = 50), resulting in 90 new interfaces.

This effort will need to be duplicated for several other collectives:

- allgather, allgatherv
- alltoall, alltoallv, alltoallw
- gather, gatherv
- scatter, scatterv

So an increase of 9 * 90 = 810 new interfaces. Not too bad, considering the alternative (exponential). But consider that the "large" interface only has (by my count via egrep/wc) 4013 interfaces. This would be increasing its size by about 20%. This is certainly not a show-stopper, but something to consider.

Note that if you go higher than OMPI's default of 4 dimensions, the number of new interfaces gets considerably larger: e.g., for 7 dimensions you get 35 send/recv dimension combinations instead of 14, so 35 * 10 * 9 = 3150 total interfaces (just for the collectives), if I did my math right.
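For anyone who wants to double-check that arithmetic, here's a throwaway sketch (my own, not anything in the tree) that reproduces the counts for both the 4- and 7-dimension cases:

  program count_interfaces
    implicit none
    integer, parameter :: ntypes = 10  ! char + logical + 4 integer + 2 real + 2 complex
    integer, parameter :: ncolls = 9   ! the nine collectives listed above
    integer :: d, sdim, rdim, combos
    do d = 4, 7, 3                     ! OMPI's default (4), then the 7-dim case
       combos = 0
       do sdim = 0, d
          do rdim = 1, d               ! recvbuf is always an array
             if (rdim >= sdim) combos = combos + 1
          end do
       end do
       print *, 'max dim', d, ': dim combos =', combos, &
                ', per collective =', combos * ntypes, &
                ', all collectives =', combos * ntypes * ncolls
    end do
  end program count_interfaces

This prints 14 / 140 / 1260 for 4 dimensions and 35 / 350 / 3150 for 7.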

2. You also identified another scenario that needs to be fixed -- support for MPI_IN_PLACE in certain collectives (MPI_REDUCE is not the only collective that supports it). It doesn't seem to be a Good Idea to allow the INTEGER type to be mixed with any other type for send/recvbuf combinations just to allow MPI_IN_PLACE. This potentially adds send/recvbuf signatures that we want to disallow (even though they are potentially valid MPI applications!) -- e.g., an INTEGER sendbuf with a REAL recvbuf. What if a user accidentally supplied an INTEGER for the sendbuf that wasn't MPI_IN_PLACE? That's exactly what the type system is supposed to prevent.

I don't know enough about the type system of F90, but it strikes me that we should be able to create a unique type for MPI_IN_PLACE (don't know why I didn't think of this before for some of the MPI sentinel values... :-\ ) and therefore have a safe mechanism for this sentinel value.
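I haven't actually tried this, but the idea would look something like the sketch below (all names hypothetical -- this is not what's in the tree). The derived type exists solely so that the MPI_IN_PLACE constant has a type that no user buffer can accidentally match:

  module mpi_in_place_type
    implicit none
    type mpi_in_place_t
       integer :: dummy            ! F90 derived types need at least one component
    end type mpi_in_place_t
    type(mpi_in_place_t), save :: MPI_IN_PLACE
  end module mpi_in_place_type

  module mpi_reduce_sketch
    implicit none
    interface MPI_Reduce
       ! One extra specific per recvbuf type/kind; only the sentinel's type
       ! matches this sendbuf, so a real buffer can't be passed by accident.
       subroutine MPI_Reduce_in_place_int(sendbuf, recvbuf, count, datatype, &
                                          op, root, comm, ierr)
         use mpi_in_place_type, only: mpi_in_place_t
         type(mpi_in_place_t), intent(in)     :: sendbuf
         integer, dimension(:), intent(inout) :: recvbuf
         integer, intent(in)  :: count, datatype, op, root, comm
         integer, intent(out) :: ierr
       end subroutine MPI_Reduce_in_place_int
    end interface
  end module mpi_reduce_sketch

A call like MPI_REDUCE(MPI_IN_PLACE, recvbuf, ...) would then resolve to that specific, while an ordinary INTEGER sendbuf would not match it -- which is exactly the type safety we want.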

This would add 10 interfaces for every function that supports MPI_IN_PLACE; a pretty small increase.

This same technique should probably be applied to some of the other sentinel values, such as MPI_ARGVS_NULL and MPI_STATUSES_IGNORE.

---------------

All that being said, what does it mean?

I think #2 is easily enough fixed (it just requires the time to do so), and has minimal impact on the number of interfaces. Implementing MPI sentinel values with unique types also makes user apps that much safer (i.e., they won't accidentally pass in an incorrect type that would be mistaken -- by the interface -- for a valid signature).

#1 is still a problem. No matter how we slice it, we're going to leave out valid combinations of send/recv buffers, which will prevent potentially legal MPI applications from compiling. This is as opposed to not having F90 interfaces for the 2-choice-buffer functions at all, which would mean that F90 apps using MPI_GATHER (for example) would simply fall back to the F77 interfaces, where no type checking is done. End result: all MPI F90 apps can compile.
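To illustrate what "no type checking" means in practice (a deliberately-wrong, hypothetical snippet): with only the F77 bindings in scope there is no explicit interface, so even this call compiles cleanly, however badly it behaves at run time:

  program f77_fallback
    implicit none
    include 'mpif.h'        ! F77 bindings: constants only, no interfaces
    real    :: sendbuf(4)
    logical :: recvbuf(2)   ! wrong type *and* too small -- compiles anyway
    integer :: ierr
    call MPI_INIT(ierr)
    sendbuf = 0.0
    call MPI_GATHER(sendbuf, 4, MPI_REAL, recvbuf, 4, MPI_INTEGER, &
                    0, MPI_COMM_WORLD, ierr)
    call MPI_FINALIZE(ierr)
  end program f77_fallback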

Simply put, with the trivial, small, and medium module sizes, all valid MPI F90 applications can compile and run. With the large size, unless we do the exponential interface explosion, we will be potentially excluding some legal MPI F90 applications -- they *will not be able to compile* (without workarounds). This is what I meant by ticket 55's title "F90 "large" interface may not entirely make sense".

So there are multiple options here:

1. Keep chasing a "good" definition of "large" such that most/all current MPI F90 apps can compile. The problem is that this target can change over time and will keep requiring maintenance.

2. Stop pursuing "large" because of the problems mentioned above. This has the potential problem of not providing type safety to F90 MPI apps for the MPI collective interfaces, but at least all apps can compile, and there's only a small number of 2-choice-buffer functions that do not get the type safety from F90 (i.e., several MPI collective functions).

3. Start implementing the proposed F03 MPI interfaces that don't have the same problems as the F90 MPI interfaces.

I have to admit that I'm leaning more towards #2 (and I wish that someone who has the time would do #3!) and discarding #1...

Comments?

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
