John R. Cary wrote:
Jeff Squyres wrote:
(for the web archives)
Brock and I talked about this .f90 code a bit off list -- he's going
to follow up with the test author, because both of us are confused by
the F90 array syntax used.
Attached is a simple send/recv code, written in (procedural) C++, that
illustrates a similar problem. It dies after a random number of iterations
with Open MPI 1.3.2 or 1.3.3. (I have submitted this before.) On some
machines the problem goes away with "-mca btl_sm_num_fifos 8" or with
"-mca btl ^sm", so I think this is
https://svn.open-mpi.org/trac/ompi/ticket/2043.
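For example, with the attached test those workarounds would be applied
like this (the -n 8 process count is arbitrary; any layout that puts
several ranks on one node should do):

mpirun -n 8 -mca btl_sm_num_fifos 8 ompi1.3.3-bug
mpirun -n 8 -mca btl ^sm ompi1.3.3-bug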
I suppose so. GCC 4.4.0. We've made a bit of progress on this
recently, but again I don't know how much further we have to go. I
posted a C-only stand-alone example to the ticket, but would appreciate
anyone jumping in and looking at it further. George has taken a peek so
far.
Since it has barriers after each send/recv pair, I do not understand
how any buffers could fill up.
Right. For 2043, it seems there is a race condition when two processes
write to the same on-node receiver. It's possible to observe the
problem with nothing but repeated barriers.
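A barrier-only reproducer could be as simple as the following sketch
(assuming, per the above, that repeated barriers over the sm BTL are
enough; the progress-report interval is arbitrary):

#include <mpi.h>
#include <iostream>

// Sketch: nothing but repeated MPI_Barrier calls; if the sm BTL race
// in ticket 2043 is the culprit, this should eventually hang.
int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rk;
  MPI_Comm_rank(MPI_COMM_WORLD, &rk);
  unsigned t = 0;
  while ( 1 ) {
    MPI_Barrier(MPI_COMM_WORLD);
    ++t;
    if ( rk == 0 && t % 10000 == 0 )  // arbitrary reporting interval
      std::cout << t << " barriers completed." << std::endl;
  }
  MPI_Finalize();  // unreachable; the loop only ends by hanging
}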
On Dec 1, 2009, at 10:46 AM, Brock Palen wrote:
The attached code is an example where openmpi/1.3.2 will lock up if
run on 48 cores over IB (4 cores per node).
The code loops over recvs from all processors on rank 0 and sends from
all other ranks; as far as I know this should work, and I can't see
why not.
Okay. Presumably the IB part is irrelevant: just having one node with
multiple senders sending to a common receiver should do the job.
Note that if I increase the openib eager limit, the program runs,
which normally indicates improper MPI usage, but I can't figure out
the problem with this code on my own.
This conflicts with the theory that it's trac 2043. Similarly, the
fact that longer (eager) messages help *suggests* (but does not prove)
that the problem is something else.
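(For reference: the eager limit Brock mentions is, to my knowledge,
controlled by the btl_openib_eager_limit MCA parameter; the value below
is only an illustration:

mpirun -n 48 -mca btl_openib_eager_limit 65536 ompi1.3.3-bug)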
/**
 * A simple test program to demonstrate a problem in OpenMPI 1.3
 *
 * Make with:
 *   mpicxx -o ompi1.3.3-bug ompi1.3.3-bug.cxx
 *
 * Run with:
 *   mpirun -n 3 ompi1.3.3-bug
 */

// mpi includes
#include <mpi.h>

// std includes
#include <iostream>
#include <vector>

// useful hashdefine
#define ARRAY_SIZE 250

/**
 * Main driver
 */
int main(int argc, char** argv) {
  // Initialize MPI
  MPI_Init(&argc, &argv);
  int rk, sz;
  MPI_Comm_rank(MPI_COMM_WORLD, &rk);
  MPI_Comm_size(MPI_COMM_WORLD, &sz);

  // Create some data to pass around
  std::vector<double> d(ARRAY_SIZE);

  // Initialize to some values if we aren't rank 0
  if ( rk )
    for ( unsigned i = 0; i < ARRAY_SIZE; ++i )
      d[i] = 2*i + 1;

  // Loop until this breaks
  unsigned t = 0;
  while ( 1 ) {
    MPI_Status s;
    if ( rk )
      MPI_Send( &d[0], d.size(), MPI_DOUBLE, 0, 3, MPI_COMM_WORLD );
    else
      for ( int i = 1; i < sz; ++i )
        MPI_Recv( &d[0], d.size(), MPI_DOUBLE, i, 3, MPI_COMM_WORLD, &s );
    MPI_Barrier(MPI_COMM_WORLD);
    std::cout << "Transmission " << ++t << " completed." << std::endl;
  }

  // Finalize MPI
  MPI_Finalize();
}