Hi,

there seems to be a bug in MPI_Win_lock/MPI_Win_unlock in Open MPI 1.10. The same code runs fine with Open MPI 1.8.7 and with Intel MPI.
The crash occurs when MPI_Win_lock is called with MPI_MODE_NOCHECK. I can reproduce it with the attached code on a single rank, with both MPI_LOCK_EXCLUSIVE and MPI_LOCK_SHARED. Open MPI was compiled with "--enable-mpi-thread-multiple" (I am not sure whether that matters).
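The failing sequence boils down to the following passive-target epoch (the full reproducer is attached at the end; the stack trace below points at ompi_osc_pt2pt_process_flush_ack, reached from MPI_Win_unlock):

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, MPI_MODE_NOCHECK, window); // MPI_LOCK_SHARED crashes as well
    int value;
    MPI_Get(&value, 1, MPI_INT, 0, 0, 1, MPI_INT, window);
    MPI_Win_unlock(0, window); // segfaults here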
Here is the error message:
/work/local/openmpi/bin/mpicxx test.cpp && /work/local/openmpi/bin/mpiexec -np 1 ./a.out
[hpcsccs4:29012] *** Process received signal ***
[hpcsccs4:29012] Signal: Segmentation fault (11)
[hpcsccs4:29012] Signal code: Address not mapped (1)
[hpcsccs4:29012] Failing at address: 0x3c
[hpcsccs4:29012] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7fe8585ccd40]
[hpcsccs4:29012] [ 1] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_process_flush_ack+0x53)[0x7fe84e51a303]
[hpcsccs4:29012] [ 2] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(+0x132bc)[0x7fe84e5162bc]
[hpcsccs4:29012] [ 3] /work/local/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_request_progress_match+0x393)[0x7fe84f1915e3]
[hpcsccs4:29012] [ 4] /work/local/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_req_start+0x21b)[0x7fe84f194a8b]
[hpcsccs4:29012] [ 5] /work/local/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_start+0x90)[0x7fe84f19bd40]
[hpcsccs4:29012] [ 6] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_irecv_w_cb+0x55)[0x7fe84e5133b5]
[hpcsccs4:29012] [ 7] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_frag_start_receive+0x57)[0x7fe84e514977]
[hpcsccs4:29012] [ 8] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(+0x13176)[0x7fe84e516176]
[hpcsccs4:29012] [ 9] /work/local/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_match+0x1dc)[0x7fe84f18daac]
[hpcsccs4:29012] [10] /work/local/openmpi/lib/openmpi/mca_btl_self.so(mca_btl_self_send+0x40)[0x7fe8542a48e0]
[hpcsccs4:29012] [11] /work/local/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send_request_start_prepare+0xcd)[0x7fe84f199e8d]
[hpcsccs4:29012] [12] /work/local/openmpi/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_start+0x411)[0x7fe84f19c0c1]
[hpcsccs4:29012] [13] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(ompi_osc_pt2pt_isend_w_cb+0x50)[0x7fe84e513250]
[hpcsccs4:29012] [14] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(+0x13db6)[0x7fe84e516db6]
[hpcsccs4:29012] [15] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(+0x160c7)[0x7fe84e5190c7]
[hpcsccs4:29012] [16] /work/local/openmpi/lib/openmpi/mca_osc_pt2pt.so(+0x16935)[0x7fe84e519935]
[hpcsccs4:29012] [17] /work/local/openmpi/lib/libmpi.so.12(PMPI_Win_unlock+0xa7)[0x7fe858ef5a57]
[hpcsccs4:29012] [18] ./a.out[0x40873a]
[hpcsccs4:29012] [19] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fe8585b7ec5]
[hpcsccs4:29012] [20] ./a.out[0x408589]
[hpcsccs4:29012] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 29012 on node hpcsccs4 exited on signal 11 (Segmentation fault).
Best regards,
Sebastian

--
Sebastian Rettenberger, M.Sc.
Technische Universität München
Department of Informatics
Chair of Scientific Computing
Boltzmannstrasse 3, 85748 Garching, Germany
http://www5.in.tum.de/
#include <mpi.h>
#include <unistd.h>
#include <iostream>

int main(int argc, char* argv[])
{
    int provided;
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Win window;
    int* data;
    MPI_Alloc_mem(sizeof(int), MPI_INFO_NULL, &data);
    data[0] = 0;
    MPI_Win_create(data, sizeof(int), sizeof(int), MPI_INFO_NULL,
        MPI_COMM_WORLD, &window);

    int remoteRank = 0;
    if (rank == 0) {
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, remoteRank, MPI_MODE_NOCHECK, window);
        //MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, window);

        int mydata;
        MPI_Get(&mydata, 1, MPI_INT, remoteRank, 0, 1, MPI_INT, window);

        MPI_Win_unlock(remoteRank, window);
    }

    MPI_Win_free(&window);
    MPI_Free_mem(data);

    MPI_Finalize();
    return 0;
}