On Mon, 31 Oct 2005 20:33:06 -0700, Jeff Squyres <jsquy...@open-mpi.org>
wrote:
On Oct 28, 2005, at 3:08 PM, Jeff Squyres wrote:
1. I'm concerned about the MPI_Reduce error -- that one shouldn't be
happening at all. We have table lookups for the MPI_Op/MPI_Datatype
combinations that are supposed to work; the fact that you're getting
this error means that HPCC is using a combination that falls outside
the pairs that are defined in the MPI standard. Sigh. But it's HPCC,
so we should support it ;-).
I'll eat crow on this one -- double checking the HPCC code, it looks
like they are doing reductions on MPI_LONG_LONG_INT, which is perfectly
legal (MPI_LONG_LONG_INT is not specifically mentioned in the
collectives section in MPI-1, but it's one of the "optional" C
datatypes, and falls within the spirit of the definition of "C integer"
in the collectives section). Despite having implementations for all
the relevant reductions in Open MPI, I forgot to add MPI_LONG_LONG_INT
into some MPI_Op cross-reference datatype tables, so MPI_Reduce didn't
think that those combinations were valid. Doh!
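
To make the failure mode above concrete, here is a minimal sketch of an op/datatype cross-reference table of the kind described. It is not Open MPI's actual code: the `sketch_*` names, the enums, and the table layout are all hypothetical, and real MPI handles are replaced by plain enums so the example compiles without an MPI installation. The point is only that a reduction can be fully implemented and still be rejected if its entry is missing from the validity table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few MPI datatypes and reduction ops.
 * These are NOT Open MPI's internal types; they exist only for this sketch. */
typedef enum {
    SKETCH_DT_INT,
    SKETCH_DT_LONG,
    SKETCH_DT_LONG_LONG_INT,
    SKETCH_DT_FLOAT,
    SKETCH_DT_DOUBLE,
    SKETCH_DT_COUNT
} sketch_dtype;

typedef enum {
    SKETCH_OP_SUM,
    SKETCH_OP_MAX,
    SKETCH_OP_BAND,
    SKETCH_OP_COUNT
} sketch_op;

/* Cross-reference table: true means the (op, datatype) pair is accepted.
 * Entries left out default to false, which is how a forgotten datatype
 * (such as the long long column) ends up being rejected at lookup time. */
static const bool sketch_op_table[SKETCH_OP_COUNT][SKETCH_DT_COUNT] = {
    [SKETCH_OP_SUM] = {
        [SKETCH_DT_INT] = true, [SKETCH_DT_LONG] = true,
        [SKETCH_DT_LONG_LONG_INT] = true,
        [SKETCH_DT_FLOAT] = true, [SKETCH_DT_DOUBLE] = true
    },
    [SKETCH_OP_MAX] = {
        [SKETCH_DT_INT] = true, [SKETCH_DT_LONG] = true,
        [SKETCH_DT_LONG_LONG_INT] = true,
        [SKETCH_DT_FLOAT] = true, [SKETCH_DT_DOUBLE] = true
    },
    /* Bitwise AND is defined for integer types only, not floating point. */
    [SKETCH_OP_BAND] = {
        [SKETCH_DT_INT] = true, [SKETCH_DT_LONG] = true,
        [SKETCH_DT_LONG_LONG_INT] = true
    }
};

/* Return true if the pair is in the table; out-of-range values are invalid. */
bool sketch_op_is_valid(sketch_op op, sketch_dtype dt)
{
    if ((int)op < 0 || op >= SKETCH_OP_COUNT ||
        (int)dt < 0 || dt >= SKETCH_DT_COUNT) {
        return false;
    }
    return sketch_op_table[op][dt];
}

/* Small usage example: integer reductions on long long are accepted,
 * while a bitwise op on a floating-point type is not. */
void sketch_self_check(void)
{
    assert(sketch_op_is_valid(SKETCH_OP_SUM, SKETCH_DT_LONG_LONG_INT));
    assert(!sketch_op_is_valid(SKETCH_OP_BAND, SKETCH_DT_DOUBLE));
}
```

In this sketch, deleting the `SKETCH_DT_LONG_LONG_INT` entries from the table would reproduce the symptom: the lookup would report the pair as invalid even though nothing is wrong with the reduction itself.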
I just committed the fix for this on the trunk; everyone's asleep right
now, but I'll get a review of this code and get it committed on the 1.0
branch tomorrow. :-)
My coworkers and I joke that we were hired for our knack for breaking
software; Open MPI will likely suffer a fair amount of our attention. You
have to do a lot of hammerin' to turn iron into steel, but the result is
worth it. If only I knew enough about implementing an MPI to be helpful
in solving the problems, rather than just finding them...