John R. Cary wrote:

Jeff Squyres wrote:

(for the web archives)

Brock and I talked about this .f90 code a bit off list -- he's going to investigate with the test author a bit more because both of us are a bit confused by the F90 array syntax used.

Attached is a simple send/recv code, written in (procedural) C++, that
illustrates a similar problem. It dies after a random number of iterations
with openmpi-1.3.2 or 1.3.3. (I have submitted this before.) On some machines
the hang goes away with "-mca btl_sm_num_fifos 8" or
"-mca btl ^sm", so I think this is
https://svn.open-mpi.org/trac/ompi/ticket/2043.
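
Concretely, with the run line from the reproducer's header, the two workarounds look like this (adjust -n to the core count you're testing; quote the caret if your shell treats it specially):

mpirun -n 3 -mca btl_sm_num_fifos 8 ompi1.3.3-bug
mpirun -n 3 -mca btl ^sm ompi1.3.3-bug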

I suppose so. GCC 4.4.0. We've made a bit of progress on this recently, but again I don't know how much further we have to go. I posted a C-only stand-alone example to the ticket, but would appreciate anyone jumping in and looking at it further. George has taken a peek so far.

Since it has barriers after each send/recv pair, I do not understand how any buffers could fill up.

Right. For 2043, it seems there is a race condition when two processes write to the same on-node receiver. It's possible to observe the problem with nothing but repeated barriers.
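
To make that concrete, a barrier-only reproducer needs nothing more than a loop like the sketch below (my illustration, not the actual test attached to the ticket):

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rk;
  MPI_Comm_rank(MPI_COMM_WORLD, &rk);
  // Repeated barriers are enough: every barrier makes multiple on-node
  // processes write to the same receiver's shared-memory FIFO.
  for (unsigned t = 1; ; ++t) {
    MPI_Barrier(MPI_COMM_WORLD);
    if (rk == 0 && t % 1000 == 0)
      std::cout << t << " barriers completed." << std::endl;
  }
  MPI_Finalize(); // never reached; the loop runs until the hang appears
}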

On Dec 1, 2009, at 10:46 AM, Brock Palen wrote:

The attached code is an example where openmpi/1.3.2 will lock up if
run on 48 cores over IB (4 cores per node).
The code loops, with rank 0 receiving from all other processors and
every other rank sending to rank 0. As far as I know this should work,
and I can't see why not.

Okay. Presumably the IB part is irrelevant. Just having one node with multiple senders sending to a common receiver should do the job.

Note that if I increase the openib eager limit, the program runs,
which normally indicates incorrect MPI usage, but I can't figure out
the problem with this code on my own.

This conflicts with the theory that it's trac 2043. Similarly, the fact that message size matters here *suggests* (but does not prove) that the problem is something else.
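
For reference, the eager-limit experiment would presumably be run along these lines (assuming the parameter in question is btl_openib_eager_limit, whose value is in bytes; the binary name here is a placeholder):

mpirun -n 48 -mca btl_openib_eager_limit 65536 ./a.out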

/**
* A simple test program to demonstrate a problem in OpenMPI 1.3
*
* Make with:
* mpicxx -o ompi1.3.3-bug ompi1.3.3-bug.cxx
*
* Run with:
* mpirun -n 3 ompi1.3.3-bug
*/

// mpi includes
#include <mpi.h>

// std includes
#include <iostream>
#include <vector>

// useful hashdefine
#define ARRAY_SIZE 250

/**
* Main driver
*/
int main(int argc, char** argv) {
  // Initialize MPI
  MPI_Init(&argc, &argv);

  int rk, sz;
  MPI_Comm_rank(MPI_COMM_WORLD, &rk);
  MPI_Comm_size(MPI_COMM_WORLD, &sz);

  // Create some data to pass around
  std::vector<double> d(ARRAY_SIZE);

  // Initialize to some values if we aren't rank 0
  if ( rk )
    for ( unsigned i = 0; i < ARRAY_SIZE; ++i )
      d[i] = 2*i + 1;

  // Loop until this breaks: every nonzero rank sends its array to rank 0,
  // rank 0 receives from each sender in turn, then everyone barriers.
  unsigned t = 0;
  while ( 1 ) {
    MPI_Status s;
    if ( rk )
      MPI_Send( &d[0], static_cast<int>(d.size()), MPI_DOUBLE, 0, 3, MPI_COMM_WORLD );
    else
      for ( int i = 1; i < sz; ++i )
        MPI_Recv( &d[0], static_cast<int>(d.size()), MPI_DOUBLE, i, 3, MPI_COMM_WORLD, &s );
    MPI_Barrier(MPI_COMM_WORLD);
    std::cout << "Transmission " << ++t << " completed." << std::endl;
  }

  // Finalize MPI (never reached; the loop above runs until the program hangs)
  MPI_Finalize();
}
