Bruce,
This issue was previously fixed on master and v2.x, but for some
reason the fix was not backported to v1.10.
I made a PR at https://github.com/open-mpi/ompi-release/pull/1120/files
In the meantime, feel free to manually apply the patch at
https://patch-diff.githubusercontent.com/raw/open-mpi/ompi-release/pull/1120.patch
Cheers,
Gilles
On 4/30/2016 7:40 AM, Palmer, Bruce J wrote:
I've been trying to recreate the semantics of the Global Arrays gather
and scatter operations using MPI RMA routines, and I've run into some
issues with MPI datatypes. I've been focusing on building MPI versions
of the GA gather and scatter calls, which I've been trying to
implement using MPI datatypes built with the MPI_Type_create_struct
call. I've developed a test program that simulates copying data into
and out of a 1D distributed array of size NSIZE. Each processor holds
a segment of approximately NSIZE/nproc elements and is responsible for
assigning every nproc-th value in the array, starting with the value
indexed by the rank of the process. After assigning values and
synchronizing the distributed data structure, each processor then
reads the values set by the processor of the next higher rank (the
process with rank nproc-1 reads the values set by process 0).
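For concreteness, here is a minimal sketch of how such a strided
datatype could be assembled with MPI_Type_create_struct; the element
type (double), the helper name, and the assumption that nproc divides
NSIZE evenly are mine, not necessarily what the attached test program
does:

#include <mpi.h>
#include <stdlib.h>

/* Hypothetical helper: build a datatype selecting every nproc-th double,
   starting at element 'rank', out of a 1D array of nsize elements.
   Assumes nsize is a multiple of nproc; names are illustrative only. */
MPI_Datatype make_strided_struct_type(int nsize, int nproc, int rank)
{
    int nelem = nsize / nproc;               /* elements owned by this rank */
    int          *blocklens = malloc(nelem * sizeof(int));
    MPI_Aint     *displs    = malloc(nelem * sizeof(MPI_Aint));
    MPI_Datatype *types     = malloc(nelem * sizeof(MPI_Datatype));
    MPI_Datatype  newtype;
    int i;

    for (i = 0; i < nelem; i++) {
        blocklens[i] = 1;                                    /* one element per block */
        displs[i]    = (MPI_Aint)(rank + i * nproc) * sizeof(double); /* byte offset */
        types[i]     = MPI_DOUBLE;
    }
    MPI_Type_create_struct(nelem, blocklens, displs, types, &newtype);
    MPI_Type_commit(&newtype);

    free(blocklens); free(displs); free(types);
    return newtype;
}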
The distributed array is represented by an MPI window created using a
standard MPI_Win_create call. The values in the array are set and read
using MPI RMA operations, either MPI_Get/MPI_Put or MPI_Rget/MPI_Rput.
Three different protocols have been used. The first is to call
MPI_Win_lock to obtain a shared lock on the remote process, then call
MPI_Put/MPI_Get, and finally call MPI_Win_unlock to release the lock.
The second protocol uses the MPI request-based calls. After the call
to MPI_Win_create, MPI_Win_lock_all is called to start a passive
synchronization epoch on the window. Data is then written to and read
from the distributed array using MPI_Rput/MPI_Rget, each immediately
followed by a call to MPI_Wait on the request handle returned by the
MPI_Rput/MPI_Rget call. The third protocol also opens a passive
synchronization epoch immediately after window creation, but uses
MPI_Put/MPI_Get immediately followed by a call to MPI_Win_flush_local.
These three protocols seem to cover all the possibilities that I have
seen in other MPI RMA-based implementations of ARMCI/GA.
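Roughly, the three protocols correspond to the sketches below (shown
for the Put direction); the window, buffer, and datatype arguments are
placeholder names of mine, and error checking is omitted:

#include <mpi.h>

/* Protocol 1: per-operation shared lock / unlock. */
void put_lock_unlock(MPI_Win win, const double *buf,
                     MPI_Datatype origin_type, MPI_Datatype target_type,
                     int target)
{
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Put(buf, 1, origin_type, target, 0, 1, target_type, win);
    MPI_Win_unlock(target, win);
}

/* Protocols 2 and 3 assume MPI_Win_lock_all(0, win) was called once,
   right after MPI_Win_create, to open the passive synchronization epoch. */

/* Protocol 2: request-based operation completed with MPI_Wait. */
void put_request_wait(MPI_Win win, const double *buf,
                      MPI_Datatype origin_type, MPI_Datatype target_type,
                      int target)
{
    MPI_Request req;
    MPI_Rput(buf, 1, origin_type, target, 0, 1, target_type, win, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}

/* Protocol 3: ordinary Put followed by a local flush. */
void put_flush_local(MPI_Win win, const double *buf,
                     MPI_Datatype origin_type, MPI_Datatype target_type,
                     int target)
{
    MPI_Put(buf, 1, origin_type, target, 0, 1, target_type, win);
    MPI_Win_flush_local(target, win);
}

The Get side mirrors these with MPI_Get/MPI_Rget; in the third case it
is the MPI_Win_flush_local call that makes the fetched data usable in
the local buffer.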
The issue I've run into is that these tests seem to work reliably if I
build the datatype with the MPI_Type_create_subarray function, but
fail for larger arrays (NSIZE ~ 10000) when I use
MPI_Type_create_struct. Because the values being set by each processor
are evenly spaced, I can use either function in this case (this is not
generally true in applications). With the struct datatype, the test
hangs on 2 processors using lock/unlock, crashes with the
request-based protocol, and reads back incorrect values in the Get
phase of the data transfer when using flush_local. These tests were
run on a Linux cluster with an InfiniBand interconnect and
NSIZE = 10000. For comparison, the same test using
MPI_Type_create_subarray seems to work reliably with all three
protocols for NSIZE = 1000000 using 1, 2, and 8 processors on 1 and 2
SMP nodes.
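For what it's worth, the equally spaced case can be expressed with
MPI_Type_create_subarray by viewing the 1D array as an (NSIZE/nproc) x
nproc matrix and selecting one column; this is just my reading of why
either function works here, with illustrative names and the same
assumption that nproc divides NSIZE:

#include <mpi.h>

/* Hypothetical helper: the same every-nproc-th-element pattern expressed
   with MPI_Type_create_subarray. With row-major layout, row i of column
   'rank' sits at index i*nproc + rank, i.e. the strided pattern above.
   Assumes nsize is a multiple of nproc; names are illustrative only. */
MPI_Datatype make_strided_subarray_type(int nsize, int nproc, int rank)
{
    int sizes[2]    = { nsize / nproc, nproc };  /* whole array as a matrix */
    int subsizes[2] = { nsize / nproc, 1 };      /* one column of it        */
    int starts[2]   = { 0, rank };               /* column chosen by rank   */
    MPI_Datatype newtype;

    MPI_Type_create_subarray(2, sizes, subsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &newtype);
    MPI_Type_commit(&newtype);
    return newtype;
}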
I’ve attached the test program for these test cases. Does anyone have
a suggestion about what might be going on here?
Bruce