Hi Joseph,

I built this test with Cray MPICH (Cray MPI) and it passed. I also tried with Open MPI master and the test passed, and with 2.0.2 I can't seem to reproduce it on my system either.
Could you post the output of config.log? Also, how intermittent is the problem?

Thanks,

Howard

2017-03-01 8:03 GMT-07:00 Joseph Schuchart <schuch...@hlrs.de>:

> Hi all,
>
> We are seeing issues in one of our applications, in which processes in a
> shared communicator allocate a shared MPI window and execute
> MPI_Accumulate simultaneously on it to iteratively update each process'
> values. The test boils down to the sample code attached. Sample output is
> as follows:
>
> ```
> $ mpirun -n 4 ./mpi_shared_accumulate
> [1] baseptr[0]: 1010 (expected 1010)
> [1] baseptr[1]: 1011 (expected 1011)
> [1] baseptr[2]: 1012 (expected 1012)
> [1] baseptr[3]: 1013 (expected 1013)
> [1] baseptr[4]: 1014 (expected 1014)
> [2] baseptr[0]: 1005 (expected 1010) [!!!]
> [2] baseptr[1]: 1006 (expected 1011) [!!!]
> [2] baseptr[2]: 1007 (expected 1012) [!!!]
> [2] baseptr[3]: 1008 (expected 1013) [!!!]
> [2] baseptr[4]: 1009 (expected 1014) [!!!]
> [3] baseptr[0]: 1010 (expected 1010)
> [0] baseptr[0]: 1010 (expected 1010)
> [0] baseptr[1]: 1011 (expected 1011)
> [0] baseptr[2]: 1012 (expected 1012)
> [0] baseptr[3]: 1013 (expected 1013)
> [0] baseptr[4]: 1014 (expected 1014)
> [3] baseptr[1]: 1011 (expected 1011)
> [3] baseptr[2]: 1012 (expected 1012)
> [3] baseptr[3]: 1013 (expected 1013)
> [3] baseptr[4]: 1014 (expected 1014)
> ```
>
> Each process should hold the same values, but sometimes (not on every
> execution) random processes diverge (marked with [!!!]).
>
> I made the following observations:
>
> 1) The issue occurs with both Open MPI 1.10.6 and 2.0.2, but not with
> MPICH 3.2.
> 2) The issue occurs only if the window is allocated through
> MPI_Win_allocate_shared; using MPI_Win_allocate works fine.
> 3) The code assumes that MPI_Accumulate atomically updates individual
> elements (please correct me if that is not covered by the MPI standard).
>
> Both Open MPI and the example code were compiled using GCC 5.4.1 and run
> on a Linux system (single node).
> Open MPI was configured with --enable-mpi-thread-multiple and
> --with-threads, but the application is not multi-threaded. Please let me
> know if you need any other information.
>
> Cheers
> Joseph
>
> --
> Dipl.-Inf. Joseph Schuchart
> High Performance Computing Center Stuttgart (HLRS)
> Nobelstr. 19
> D-70569 Stuttgart
>
> Tel.: +49(0)711-68565890
> Fax: +49(0)711-6856832
> E-Mail: schuch...@hlrs.de
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users