Hi Joseph,

I built this test with Cray MPICH (Cray MPI) and it passed.  I also tried
with Open MPI master and the test passed.  I also tried with 2.0.2
and can't seem to reproduce the problem on my system.

Could you post the output of config.log?

Also, how intermittent is the problem?


Thanks,

Howard




2017-03-01 8:03 GMT-07:00 Joseph Schuchart <schuch...@hlrs.de>:

> Hi all,
>
> We are seeing issues in one of our applications, in which processes in a
> shared communicator allocate a shared MPI window and execute MPI_Accumulate
> on it simultaneously to iteratively update each process's values. The test
> boils down to the attached sample code. Sample output is as follows:
>
> ```
> $ mpirun -n 4 ./mpi_shared_accumulate
> [1] baseptr[0]: 1010 (expected 1010)
> [1] baseptr[1]: 1011 (expected 1011)
> [1] baseptr[2]: 1012 (expected 1012)
> [1] baseptr[3]: 1013 (expected 1013)
> [1] baseptr[4]: 1014 (expected 1014)
> [2] baseptr[0]: 1005 (expected 1010) [!!!]
> [2] baseptr[1]: 1006 (expected 1011) [!!!]
> [2] baseptr[2]: 1007 (expected 1012) [!!!]
> [2] baseptr[3]: 1008 (expected 1013) [!!!]
> [2] baseptr[4]: 1009 (expected 1014) [!!!]
> [3] baseptr[0]: 1010 (expected 1010)
> [0] baseptr[0]: 1010 (expected 1010)
> [0] baseptr[1]: 1011 (expected 1011)
> [0] baseptr[2]: 1012 (expected 1012)
> [0] baseptr[3]: 1013 (expected 1013)
> [0] baseptr[4]: 1014 (expected 1014)
> [3] baseptr[1]: 1011 (expected 1011)
> [3] baseptr[2]: 1012 (expected 1012)
> [3] baseptr[3]: 1013 (expected 1013)
> [3] baseptr[4]: 1014 (expected 1014)
> ```
>
> Each process should hold the same values, but on some (not all) executions
> random processes diverge (marked with [!!!]).
>
> I made the following observations:
>
> 1) The issue occurs with both OpenMPI 1.10.6 and 2.0.2 but not with MPICH
> 3.2.
> 2) The issue occurs only if the window is allocated through
> MPI_Win_allocate_shared; using MPI_Win_allocate works fine.
> 3) The code assumes that MPI_Accumulate atomically updates individual
> elements (please correct me if that is not covered by the MPI standard).
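>
> For reference, here is a minimal sketch of what the test boils down to
> (the actual attachment is not reproduced in this message; the element
> count, initial values, and update pattern are assumptions, so the exact
> numbers differ from the output above):
>
> ```c
> #include <mpi.h>
> #include <stdio.h>
> #include <stdint.h>
>
> #define NELEM 5
>
> int main(int argc, char **argv)
> {
>     int       rank, nproc;
>     uint64_t *baseptr;
>     MPI_Win   win;
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &nproc);
>
>     /* one shared segment of NELEM elements per process */
>     MPI_Win_allocate_shared(NELEM * sizeof(uint64_t), sizeof(uint64_t),
>                             MPI_INFO_NULL, MPI_COMM_WORLD,
>                             &baseptr, &win);
>     for (int i = 0; i < NELEM; ++i)
>         baseptr[i] = 1000 + i;           /* assumed initial values */
>
>     MPI_Win_lock_all(0, win);
>     MPI_Barrier(MPI_COMM_WORLD);         /* all segments initialized */
>
>     /* every process atomically adds 1 to every element on every
>      * target, so each element should end up larger by nproc */
>     uint64_t one = 1;
>     for (int target = 0; target < nproc; ++target)
>         for (int i = 0; i < NELEM; ++i)
>             MPI_Accumulate(&one, 1, MPI_UINT64_T, target,
>                            i /* displacement */, 1, MPI_UINT64_T,
>                            MPI_SUM, win);
>     MPI_Win_flush_all(win);
>     MPI_Barrier(MPI_COMM_WORLD);         /* all updates complete */
>     MPI_Win_unlock_all(win);
>
>     for (int i = 0; i < NELEM; ++i)
>         printf("[%d] baseptr[%d]: %llu (expected %llu)\n", rank, i,
>                (unsigned long long)baseptr[i],
>                (unsigned long long)(1000 + i + nproc));
>
>     MPI_Win_free(&win);
>     MPI_Finalize();
>     return 0;
> }
> ```
>
> In this sketch the concurrent MPI_Accumulate calls rely on the per-element
> atomicity of MPI_SUM on predefined datatypes, which is the assumption
> stated in point 3.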
>
> Both OpenMPI and the example code were compiled using GCC 5.4.1 and run on
> a Linux system (single node). OpenMPI was configured with
> --enable-mpi-thread-multiple and --with-threads, but the application is not
> multi-threaded. Please let me know if you need any other information.
>
> Cheers
> Joseph
>
> --
> Dipl.-Inf. Joseph Schuchart
> High Performance Computing Center Stuttgart (HLRS)
> Nobelstr. 19
> D-70569 Stuttgart
>
> Tel.: +49(0)711-68565890
> Fax: +49(0)711-6856832
> E-Mail: schuch...@hlrs.de
>
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>