Looking at this a little closer on the v1.2 branch, it does look like it could be a bug.

The child definitely does not return from INTERCOMM_MERGE until the parent enters MPI_RECV. I put in a bogus MPI_TEST call before the parent calls MPI_RECV, and that also causes the child to return from INTERCOMM_MERGE. That makes it sound like something is not finishing its progress properly before the parent leaves INTERCOMM_MERGE; entering the progress engine again (e.g., by calling MPI_TEST or MPI_RECV) makes enough happen to let the children complete the INTERCOMM_MERGE. :-\

To be honest, I don't think we'll be too motivated to fix this in the old v1.2 series because we're getting darn close to putting out v1.3. Both the support for the dynamics and the progression engine have changed a *lot* behind the scenes in v1.3.

To be specific: this problem doesn't seem to happen in the code for the upcoming v1.3 release. However, I would not encourage using a nightly snapshot at the moment; we have a fairly gnarly bug in other kinds of progression issues that still needs to be fixed.


On Jul 28, 2008, at 5:02 PM, Jeff Squyres wrote:

> On Jul 28, 2008, at 4:56 PM, Aurélien Bouteiller wrote:
>
>> Having different values is fine for high parameter.
>>
>> I think the problem comes from using NULL, NULL instead of &argc, &argv as parameters for MPI_Init.
>
> Calling MPI_INIT with NULL, NULL is legal; we don't actually do anything with those values, IIRC.
>
> --
> Jeff Squyres
> Cisco Systems




--
Jeff Squyres
Cisco Systems

