On Feb 20, 2009, at 2:20 PM, Jim Kusznir wrote:

I just went to www.open-mpi.org, went to download, then source rpm.
Looks like it was actually 1.3-1.  Here's the src.rpm that I pulled
in:

http://www.open-mpi.org/software/ompi/v1.3/downloads/openmpi-1.3-1.src.rpm

Ah, gotcha. Yes, that's 1.3.0, SRPM version 1. We didn't make up this nomenclature. :-(

The reason for this upgrade is that a user seems to have found a bug,
possibly in the Open MPI code, that occasionally causes an MPI_Send()
message to get lost.  He's managed to reproduce it multiple times,
and we can't find anything in his code that could cause it... He's got
logs of MPI_Send() going out, but the matching MPI_Recv() never
receives anything, thus killing his code.  We're currently running
1.2.8 with OFED support (haven't tried turning off OFED, etc. yet).

Ok. 1.3.x is much mo' betta' than 1.2 in many ways. We could probably help track down the problem, but if you're willing to upgrade to 1.3.x, it'll hopefully just make the problem go away.

Can you try a 1.3.1 nightly tarball?

--
Jeff Squyres
Cisco Systems
