Tim,

Thanks for the bug report. I just committed a patch to our development version (revision 13079). It will go into 1.2b2 soon, after some soak time. Until then, please use the latest nightly tarball (with a revision number higher than 13079) from our website.

  Thanks,
    george.

On Jan 10, 2007, at 5:19 PM, Tim Campbell wrote:

Greetings,

Attached is a small Fortran test program that triggers a failure in mpi_waitall. The problem is that after a couple of calls to mpi_startall and mpi_waitall, some of the mpi_requests become corrupted. This causes the next call to mpi_startall to fail. Here is the output from a 2-CPU run.

[44]% mpif90 -g test_ompi.f
[45]% mpirun -np 2 a.out
TEST(A):   0  1 |        2       3       4       5
TEST(B):   0  1 |        2       3       4       5
OUTPUT:   0  1 |      100     100     101     101
TEST(A):   0  2 |        2       3       4       5
TEST(B):   0  2 |   -32766  -32766       4       5
OUTPUT:   0  2 |      200     200     201     201
TEST(A):   1  1 |        2       3       4       5
TEST(B):   1  1 |        2       3       4       5
OUTPUT:   1  1 |      101     101     100     100
TEST(A):   1  2 |        2       3       4       5
TEST(B):   1  2 |   -32766  -32766       4       5
OUTPUT:   1  2 |      201     201     200     200
^Cmpirun: killing job...

The "-32766" values show up in the mpi_request array after the second call to mpi_waitall. Using prints in the OpenMPI code I have tracked the problem to

ompi/request/req_wait.c:ompi_request_wait_all().

I find upon entry to ompi_request_wait_all() that the values of request[:]->req_f_to_c_index are valid. However, upon exit from ompi_request_wait_all() the first two entries of request[:]->req_f_to_c_index have the value of -32766.
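
For anyone who wants to repeat this kind of entry/exit check, it can be sketched in C roughly as below. This is only an illustration under stated assumptions: fake_request, snapshot_indices and count_changed_indices are hypothetical names made up for the sketch, not Open MPI code or the actual prints used here. Only the field name req_f_to_c_index comes from the report above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a request object; NOT Open MPI's real
 * request structure. Only the field named in the report is modeled. */
typedef struct {
    int req_f_to_c_index;
} fake_request;

/* Record the current index of each request in saved[] (call on entry). */
static void snapshot_indices(fake_request *const *reqs, size_t n, int *saved)
{
    assert(reqs != NULL && saved != NULL);
    for (size_t i = 0; i < n; ++i) {
        saved[i] = reqs[i]->req_f_to_c_index;
    }
}

/* Compare the current indices against the snapshot (call on exit).
 * Prints each mismatch to stderr and returns how many entries changed. */
static size_t count_changed_indices(fake_request *const *reqs, size_t n,
                                    const int *saved)
{
    size_t changed = 0;
    assert(reqs != NULL && saved != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (reqs[i]->req_f_to_c_index != saved[i]) {
            fprintf(stderr, "request[%zu]: index %d -> %d\n",
                    i, saved[i], reqs[i]->req_f_to_c_index);
            ++changed;
        }
    }
    return changed;
}
```

The idea is to call snapshot_indices just before the suspect routine and count_changed_indices just after it. A nonzero return means that some index was modified in between.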

I am testing with OpenMPI version 1.2b2. This problem occurs on both x86_64 and Intel i386 and it occurs for both Portland Group compilers and for GCC/G95.

Cheers,
Tim Campbell
Naval Research Laboratory
<test_ompi.f.gz>

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
