Gilles,

I downloaded and built openmpi-2.0.0rc2 and used that for the test. I get a 
crash on more than 1 processor for the lock/unlock protocol with the error 
message

[node005:29916] *** An error occurred in MPI_Win_lock
[node005:29916] *** reported by process [3736862721,6]
[node005:29916] *** on win rdma window 3
[node005:29916] *** MPI_ERR_RMA_SYNC: error executing rma sync
[node005:29916] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
[node005:29916] ***    and potentially your MPI job)

and the request-based protocol hangs on the MPI_Rget call. The flush_local 
protocol seems to work, though. Unlike 1.8.3, the problems seem to occur no 
matter what the value of NSIZE is. Should I try building 1.10 with the patch 
applied to it?

Bruce

Message: 1
List-Post: users@lists.open-mpi.org
Date: Mon, 2 May 2016 13:42:21 +0900
From: Gilles Gouaillardet <gil...@rist.or.jp>
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] MPI Datatypes and RMA
Message-ID: <01c20fdf-c41b-96a8-6732-661745ddf...@rist.or.jp>
Content-Type: text/plain; charset="windows-1252"; Format="flowed"

Bruce,


This issue was previously fixed on master and v2.x, but for some reason the 
fix was not backported to v1.10.

I made a PR at https://github.com/open-mpi/ompi-release/pull/1120/files

In the meantime, feel free to manually apply the patch at 
https://patch-diff.githubusercontent.com/raw/open-mpi/ompi-release/pull/1120.patch
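
For reference, applying the patch by hand could look roughly like the sketch 
below. It assumes a POSIX shell and GNU patch, and that the patch uses git-style 
a/ and b/ path prefixes (hence -p1). The helper name apply_ompi_patch and the 
paths in the usage comment are hypothetical; adjust them to wherever the v1.10 
source tree is unpacked.

```shell
# Hypothetical helper (not part of Open MPI): apply a unified diff to a
# source tree, checking with a dry run first so a mismatch changes nothing.
apply_ompi_patch() {
    src_dir=$1
    patch_file=$2

    # Make the patch path absolute, because we cd into the source tree below.
    case $patch_file in
        /*) ;;
        *) patch_file=$PWD/$patch_file ;;
    esac

    # Dry run: fails if any hunk does not apply to this tree.
    (cd "$src_dir" && patch -p1 --dry-run < "$patch_file" > /dev/null) || {
        echo "patch does not apply cleanly to $src_dir" >&2
        return 1
    }

    # Real run: modify the files in place.
    (cd "$src_dir" && patch -p1 < "$patch_file")
}

# Usage (assumed file name and source location):
#   curl -L -o 1120.patch \
#     https://patch-diff.githubusercontent.com/raw/open-mpi/ompi-release/pull/1120.patch
#   apply_ompi_patch /path/to/openmpi-1.10-source 1120.patch
```

After the patch is applied, the tree has to be rebuilt and reinstalled before 
rerunning the test.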


Cheers,


Gilles
