Joseph,

Thanks for the report and the test program.


The memory allocated by MPI_Win_allocate_shared() is indeed aligned on (4*communicator_size).

I could not reproduce this with MPI_Win_allocate(), but I will investigate it.


I fixed MPI_Win_allocate_shared() in https://github.com/open-mpi/ompi/pull/2978.

Meanwhile, you can manually download and apply the patch at https://github.com/open-mpi/ompi/pull/2978.patch


Cheers,


Gilles


On 2/14/2017 11:01 PM, Joseph Schuchart wrote:
Hi,

We have been experiencing strange crashes in our application, which mostly works on memory allocated through MPI_Win_allocate and MPI_Win_allocate_shared. We eventually realized that the application crashes if it is compiled with -O3 or -Ofast and run with an odd number of processes on our x86_64 machines.

After some debugging we found that the minimum alignment of the memory returned by MPI_Win_allocate is 4 bytes, which is fine for 32-bit data types but causes problems with 64-bit data types (such as size_t) and automatic loop vectorization (tested with GCC 5.3.0). Here the compiler assumes a natural alignment, which should be at least 8 bytes on x86_64 and is guaranteed by malloc and new.

Interestingly, the alignment of the returned memory depends on the number of processes running. I am attaching a small reproducer that prints the alignments of memory returned by MPI_Win_allocate, MPI_Win_allocate_shared, and MPI_Alloc_mem (the latter seems to be fine).

Example for 2 processes (correct alignment):

[MPI_Alloc_mem] Alignment of baseptr=0x260ac60: 32
[MPI_Win_allocate] Alignment of baseptr=0x7f94d7aa30a8: 40
[MPI_Win_allocate_shared] Alignment of baseptr=0x7f94d7aa30a8: 40

Example for 3 processes (4-byte alignment even with an 8-byte displacement unit):

[MPI_Alloc_mem] Alignment of baseptr=0x115e970: 48
[MPI_Win_allocate] Alignment of baseptr=0x7f685f50f0c4: 4
[MPI_Win_allocate_shared] Alignment of baseptr=0x7fec618bc0c4: 4
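
The 4-byte figure for the 3-process run can be checked from the printed addresses alone. The following is a minimal sketch, not the attached reproducer: the helper names address_alignment, pointer_alignment and is_aligned are hypothetical, and "alignment" is assumed here to mean the largest power of two that divides the address.

```c
#include <stdint.h>

/* Largest power of two that divides addr (hypothetical helper).
 * Returns 0 for address 0, which has no lowest set bit. */
uint64_t address_alignment(uint64_t addr)
{
    /* (~addr + 1) is the two's complement negation of addr; ANDing it
     * with addr isolates the lowest set bit. */
    return addr & (~addr + 1);
}

/* Same computation, starting from a pointer. */
uint64_t pointer_alignment(const void *p)
{
    return address_alignment((uint64_t)(uintptr_t)p);
}

/* Nonzero if p is a multiple of alignment.
 * Assumption: alignment is a power of two. */
int is_aligned(const void *p, uint64_t alignment)
{
    return ((uint64_t)(uintptr_t)p & (alignment - 1)) == 0;
}
```

For the two window addresses printed in the 3-process run, 0x7f685f50f0c4 and 0x7fec618bc0c4, address_alignment returns 4, so neither is suitably aligned for an 8-byte type.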

Is this a known issue? I expect users to rely on the basic alignment guarantees of malloc/new holding for any function that provides malloc-like behavior, all the more so since a hint about the alignment requirements is passed to MPI_Win_allocate in the form of the disp_unit argument.
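
Where the allocator cannot be relied on for alignment, one possible caller-side workaround is to request a few extra bytes and round the returned pointer up. The sketch below covers only the address arithmetic and makes no MPI calls. The helper names align_up_addr, align_up_ptr and padded_size are hypothetical, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of alignment.
 * Assumption: alignment is a nonzero power of two. */
uint64_t align_up_addr(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}

/* Pointer version of align_up_addr. */
void *align_up_ptr(void *p, size_t alignment)
{
    return (void *)(uintptr_t)align_up_addr((uint64_t)(uintptr_t)p,
                                            (uint64_t)alignment);
}

/* Number of bytes to request so that `size` usable bytes remain after
 * the base pointer has been rounded up to `alignment`. */
size_t padded_size(size_t size, size_t alignment)
{
    return size + alignment - 1;
}
```

A caller would request padded_size(size, 8) bytes and use align_up_ptr(baseptr, 8) locally. Note that RMA displacements remain relative to the original window base, so remote peers would have to account for the padding. This is therefore only a stopgap.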

I was able to reproduce this issue in both OpenMPI 1.10.5 and 2.0.2. I also tested with MPICH, which provides correct alignment.

Cheers,
Joseph



_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
