https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81450

Bug ID: 81450
Summary: Typedef with assume aligned builtin yields segmentation fault in nested loop
Product: gcc
Version: 6.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ipa
Assignee: unassigned at gcc dot gnu.org
Reporter: philipp.kopp at tum dot de
CC: marxin at gcc dot gnu.org
Target Milestone: ---

Created attachment 41762
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41762&action=edit
Small example with the working and the failing version

Hi,

I found that the assume aligned built-in gives a segmentation fault in nested
loops when it is used with a typedef (or the C++11 using keyword). I attached a
small example code.

I am using Ubuntu 17.04 with the standard gcc 6.3.0 from the Ubuntu archives.
I am compiling with gcc -O3 -g, so SSE vectorization is enabled; without it the
code runs fine. The core part where execution fails is the following:

> typedef double __attribute__((aligned (32))) AlignedDouble;
>
> const size_t size = 17;
> double alpha = 1.0 / 3.0;
>
> // uses posix_memalign (see full testcase)
> AlignedDouble* A = aligned_doubles( size * size );
>
> for( size_t i = 0; i < size; ++i )
> {
>     printf( "i = %lu\n", i );
>
>     for( size_t j = 0; j < size; ++j )
>     {
>         A[j + i * size] += alpha;
>     }
> }

I checked the disassembly and found that the loop over j runs once completely
and fails the second time, because a temporary pointer to A is incremented by
i * size, which yields an unaligned pointer if size % 2 != 0 (or 4 with AVX2).

However, if the alignment is used directly in the definition of A, without a
typedef, the result is different. According to the output of -fopt-info-loop,
the loop is peeled for alignment in this case.

With typedef:

> gcc -O3 -g main.cpp -o main -fopt-info-loop
> main.cpp:40:26: note: loop vectorized
> main.cpp:40:26: note: loop turned into non-loop; it never loops
> main.cpp:15:5: note: loop turned into non-loop; it never loops.
> main.cpp:15:5: note: loop with 8 iterations completely unrolled

Without typedef:

> gcc -O3 -g main.cpp -o main -fopt-info-loop
> main.cpp:30:26: note: loop vectorized
> main.cpp:30:26: note: loop peeled for vectorization to enhance alignment
> main.cpp:30:26: note: loop turned into non-loop; it never loops
> main.cpp:15:5: note: loop turned into non-loop; it never loops.
> main.cpp:15:5: note: loop with 9 iterations completely unrolled
> main.cpp:15:5: note: loop turned into non-loop; it never loops

In the attached test case you will find, for both scenarios, the code, the
binary, an objdump, and the outputs of -fopt-info-loop and -fopt-info-all. In
the objdump I marked the increment of A as well as the instruction where the
segmentation fault happens.

Please let me know if more info is needed.

Thanks a lot for your time!

Best wishes,
Philipp Kopp
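
For reference, the alignment arithmetic behind the observation that A incremented by i * size becomes unaligned can be sketched as follows. This is only an illustration and is not part of the attached test case: the function names are hypothetical, and it assumes sizeof(double) == 8 and a base pointer that is itself correctly aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Byte offset of the first element of row `row` relative to the base pointer,
// for a row-major array of doubles with `size` elements per row.
constexpr std::size_t row_offset_bytes(std::size_t row, std::size_t size)
{
    return row * size * sizeof(double);
}

// True if the start of the row keeps the alignment of an aligned base pointer.
constexpr bool row_start_is_aligned(std::size_t row, std::size_t size,
                                    std::size_t alignment_bytes)
{
    return row_offset_bytes(row, size) % alignment_bytes == 0;
}

// Index of the first row whose start is misaligned, or `size` if there is none.
inline std::size_t first_misaligned_row(std::size_t size, std::size_t alignment_bytes)
{
    assert(alignment_bytes != 0);
    for (std::size_t i = 0; i < size; ++i)
        if (!row_start_is_aligned(i, size, alignment_bytes))
            return i;
    return size;
}

// Prints one line per row with its byte offset and alignment status.
inline void print_row_alignment(std::size_t size, std::size_t alignment_bytes)
{
    for (std::size_t i = 0; i < size; ++i)
        std::printf("row %zu: offset %zu bytes, %s\n", i, row_offset_bytes(i, size),
                    row_start_is_aligned(i, size, alignment_bytes) ? "aligned"
                                                                   : "misaligned");
}
```

Under these assumptions, with size = 17 the first misaligned row is i = 1 for both 16-byte and 32-byte alignment, which matches the loop over j failing on its second run. With an even size such as 16, every row start stays 32-byte aligned.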