Gilles, all,
I can confirm that the fix has landed in OpenMPI 2.1.1. Unfortunately,
1.10.7 still provides 4-Byte-aligned memory. Will this be fixed in the
1.x branch at some point? We are selectively enabling the use of shared
memory windows when using OpenMPI so it would be interesting to know
whether it's sufficient to check for the 2.x branch :)
Best
Joseph
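The version check asked about above could be sketched as follows. This is only a sketch: it takes the 2.1.1 threshold reported in this message as the cut-off and treats every 1.x and 2.0.x release as unfixed, which the thread does not confirm either way. The helper name ompi_win_alignment_fixed is hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if the given Open MPI version is
 * at least 2.1.1, the release in which the fix was observed.
 * Assumption: no 1.x or 2.0.x release carries the fix.
 */
static bool ompi_win_alignment_fixed(int major, int minor, int release)
{
    if (major != 2)
        return major > 2;   /* 1.x: not fixed, 3.x and later: assumed fixed */
    if (minor != 1)
        return minor > 1;   /* 2.0.x: not fixed, 2.2 and later: assumed fixed */
    return release >= 1;    /* 2.1.0: not fixed, 2.1.1 and later: fixed */
}
```

When building against Open MPI, the arguments would typically come from the OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION and OMPI_RELEASE_VERSION macros in mpi.h, guarded by #ifdef OPEN_MPI so that other MPI implementations are not affected.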
On 02/15/2017 05:45 AM, Gilles Gouaillardet wrote:
Joseph,
Thanks for the report and the test program.
The memory allocated by MPI_Win_allocate_shared() is indeed aligned on
(4*communicator_size).
I could not reproduce this with MPI_Win_allocate(), but will
investigate it.
I fixed MPI_Win_allocate_shared() in
https://github.com/open-mpi/ompi/pull/2978.
Meanwhile, you can manually download and apply the patch at
https://github.com/open-mpi/ompi/pull/2978.patch
Cheers,
Gilles
On 2/14/2017 11:01 PM, Joseph Schuchart wrote:
Hi,
We have been experiencing strange crashes in our application that
mostly works on memory allocated through MPI_Win_allocate and
MPI_Win_allocate_shared. We eventually realized that the application
crashes if it is compiled with -O3 or -Ofast and run with an odd
number of processes on our x86_64 machines.
After some debugging we found that the minimum alignment of the memory
returned by MPI_Win_allocate is 4 bytes, which is fine for 32-bit data
types but causes problems with 64-bit data types (such as size_t) and
automatic loop vectorization (tested with GCC 5.3.0). Here the
compiler assumes natural alignment, which should be at least 8 bytes
on x86_64 and is guaranteed by malloc and new.
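Until a fixed release is available, one possible application-side workaround is to request a few extra bytes and round the returned base pointer up to the alignment the compiler expects. This is not something proposed in this thread, only a minimal sketch; the names align_up_addr and align_up are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of alignment (a power of two). */
static uintptr_t align_up_addr(uintptr_t addr, size_t alignment)
{
    /* The mask trick below is only valid for powers of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    uintptr_t mask = (uintptr_t)alignment - 1;
    return (addr + mask) & ~mask;
}

/* Pointer convenience wrapper around align_up_addr(). */
static void *align_up(void *p, size_t alignment)
{
    return (void *)align_up_addr((uintptr_t)p, alignment);
}
```

To use it, the window would be allocated with size + alignment - 1 bytes and local accesses would go through align_up(baseptr, 8). The shift has to be accounted for everywhere else too: RMA displacements are relative to the unadjusted window base, and pointers obtained from MPI_Win_shared_query would need the same rounding.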
Interestingly, the alignment of the returned memory depends on the
number of processes running. I am attaching a small reproducer that
prints the alignments of memory returned by MPI_Win_allocate,
MPI_Win_allocate_shared, and MPI_Alloc_mem (the latter seems to be fine).
Example for 2 processes (correct alignment):
[MPI_Alloc_mem] Alignment of baseptr=0x260ac60: 32
[MPI_Win_allocate] Alignment of baseptr=0x7f94d7aa30a8: 40
[MPI_Win_allocate_shared] Alignment of baseptr=0x7f94d7aa30a8: 40
Example for 3 processes (4-byte alignment even with an 8-byte
displacement unit):
[MPI_Alloc_mem] Alignment of baseptr=0x115e970: 48
[MPI_Win_allocate] Alignment of baseptr=0x7f685f50f0c4: 4
[MPI_Win_allocate_shared] Alignment of baseptr=0x7fec618bc0c4: 4
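The reproducer itself is not included here, so the following is only a sketch of how such numbers can be computed from an address. The printed values above happen to match the address modulo 64 (for example 0x7f94d7aa30a8 % 64 = 40). That is an inference from the output, not something stated in the thread. The natural alignment, meaning the largest power of two dividing the address, is 8 for the 2-process window pointers and 4 for the 3-process ones. Both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Largest power of two that divides addr, i.e. its lowest set bit.
 * Returns 0 for addr == 0.
 */
static uint64_t natural_alignment(uint64_t addr)
{
    return addr & (~addr + 1);
}

/*
 * Offset of addr within a 64-byte block. Assumption: this is what the
 * "Alignment of baseptr" output shows; the values are consistent with it.
 */
static uint64_t offset_mod_64(uint64_t addr)
{
    return addr % 64;
}
```

An 8-byte type such as size_t is safe to access at baseptr only if natural_alignment((uint64_t)(uintptr_t)baseptr) is at least 8, which holds for the 2-process addresses and fails for the 3-process ones.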
Is this a known issue? I expect users to rely on the basic alignment
guarantees of malloc/new holding for any function that provides
malloc-like behavior, even more so because a hint about the alignment
requirements is passed to MPI_Win_allocate in the form of the disp_unit
argument.
I was able to reproduce this issue in both OpenMPI 1.10.5 and 2.0.2. I
also tested with MPICH, which provides correct alignment.
Cheers,
Joseph
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
--
Dipl.-Inf. Joseph Schuchart
High Performance Computing Center Stuttgart (HLRS)
Nobelstr. 19
D-70569 Stuttgart
Tel.: +49(0)711-68565890
Fax: +49(0)711-6856832
E-Mail: schuch...@hlrs.de