You cannot perform synchronization at the same time as communication on the same target. This means that if one thread is in MPI_Put/MPI_Get/MPI_Accumulate (to a given target) you can't have another thread in MPI_Win_flush (on that target) or in MPI_Win_flush_all(). If your program is doing that, it is not a valid MPI program. If you want to ensure that a particular put operation is complete, try MPI_Rput instead.

-Nathan
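[A minimal sketch of the request-based alternative Nathan suggests, assuming a window allocated with MPI_Win_allocate and one double exchanged between neighboring ranks; the buffer and rank names are illustrative and not taken from the original reproducer:]

```
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* One double per rank, exposed through an allocated window. */
    double *base;
    MPI_Win win;
    MPI_Win_allocate(sizeof(double), sizeof(double), MPI_INFO_NULL,
                     MPI_COMM_WORLD, &base, &win);
    *base = -1.0;
    MPI_Barrier(MPI_COMM_WORLD);   /* everyone initialized before puts start */

    MPI_Win_lock_all(0, win);

    /* Request-based put: waiting on the request completes this one
     * operation at the origin, so no window-wide MPI_Win_flush is needed
     * to know the origin buffer may be reused. */
    double value = (double)rank;
    MPI_Request req;
    MPI_Rput(&value, 1, MPI_DOUBLE, (rank + 1) % size, 0, 1, MPI_DOUBLE,
             win, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    MPI_Win_unlock_all(win);       /* remote completion happens here */

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

Note that completing an MPI_Rput request only guarantees local completion (the origin buffer may be reused); remote completion at the target still happens at MPI_Win_unlock_all or at a flush.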
> On Feb 19, 2017, at 2:34 PM, Joseph Schuchart <schuch...@hlrs.de> wrote:
>
> All,
>
> We are trying to combine MPI_Put and MPI_Win_flush on locked (using
> MPI_Win_lock_all) dynamic windows to mimic a blocking put. The application
> is (potentially) multi-threaded and we are thus relying on
> MPI_THREAD_MULTIPLE support to be available.
>
> When I try to use this combination (MPI_Put + MPI_Win_flush) in our
> application, I am seeing threads occasionally hang in MPI_Win_flush,
> probably waiting for some progress to happen. However, when I try to create
> a small reproducer (attached; the original application has multiple layers
> of abstraction), I am seeing fatal errors in MPI_Win_flush if using more
> than one thread:
>
> ```
> [beryl:18037] *** An error occurred in MPI_Win_flush
> [beryl:18037] *** reported by process [4020043777,2]
> [beryl:18037] *** on win pt2pt window 3
> [beryl:18037] *** MPI_ERR_RMA_SYNC: error executing rma sync
> [beryl:18037] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
> [beryl:18037] *** and potentially your MPI job)
> ```
>
> I could only trigger this on dynamic windows with multiple concurrent
> threads running.
>
> So: Is this a valid MPI program (except for the missing clean-up at the end
> ;))? It seems to run fine with MPICH, but maybe they are more tolerant of
> some programming errors...
>
> If it is a valid MPI program, I assume there is some race condition in
> MPI_Win_flush that leads to the fatal error (or to the hang that I observe
> otherwise)?
>
> I tested this with Open MPI 1.10.5 on a single-node Linux Mint 18.1 system
> with stock kernel 4.8.0-36 (aka my laptop). Open MPI and the test were both
> compiled using GCC 5.3.0. I could not run it using Open MPI 2.0.2 due to the
> fatal error in MPI_Win_create (which also applies to MPI_Win_create_dynamic,
> see my other thread; not sure if they are related).
>
> Please let me know if this is a valid use case and whether I can provide you
> with additional information if required.
>
> Many thanks in advance!
>
> Cheers
> Joseph
>
> --
> Dipl.-Inf. Joseph Schuchart
> High Performance Computing Center Stuttgart (HLRS)
> Nobelstr. 19
> D-70569 Stuttgart
>
> Tel.: +49(0)711-68565890
> Fax: +49(0)711-6856832
> E-Mail: schuch...@hlrs.de
>
> <ompi_flush_hang.c>
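[For context, a condensed sketch of the pattern Joseph describes, reconstructed from the prose above rather than copied from the ompi_flush_hang.c attachment: several threads each issue MPI_Put immediately followed by MPI_Win_flush on a dynamic window locked with MPI_Win_lock_all. Per Nathan's reply, one thread's flush can overlap another thread's put on the same target, which makes this invalid. All names and counts here are illustrative:]

```
/* Compile with: mpicc -pthread ... */
#include <mpi.h>
#include <pthread.h>

#define NUM_THREADS 4

static MPI_Win  win;
static MPI_Aint disp;   /* absolute target address, exchanged below */

static void *thread_put(void *arg) {
    double value = (double)*(int *)arg;
    /* "Blocking put": MPI_Put immediately followed by MPI_Win_flush.
     * With several threads in here at once, one thread's MPI_Win_flush
     * can overlap another thread's MPI_Put on the same target -- the
     * overlap the reply above identifies as invalid. */
    MPI_Put(&value, 1, MPI_DOUBLE, 0, disp, 1, MPI_DOUBLE, win);
    MPI_Win_flush(0, win);
    return NULL;
}

int main(int argc, char **argv) {
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) MPI_Abort(MPI_COMM_WORLD, 1);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Rank 0 exposes one double and broadcasts its address; dynamic
     * windows use absolute addresses as displacements. */
    double target_buf = 0.0;
    if (rank == 0) {
        MPI_Win_attach(win, &target_buf, sizeof(double));
        MPI_Get_address(&target_buf, &disp);
    }
    MPI_Bcast(&disp, 1, MPI_AINT, 0, MPI_COMM_WORLD);

    MPI_Win_lock_all(0, win);

    pthread_t threads[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, thread_put, &rank);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    MPI_Win_unlock_all(win);
    if (rank == 0) MPI_Win_detach(win, &target_buf);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```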