https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116046
Bug ID: 116046
Summary: vmovdqa64 is used when unaligned memory caused by unaligned %rsp/%rbp
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: haochen.jiang at intel dot com
Target Milestone: ---

Under x86-64-pc-linux-gnu, when I compile the test avx512f-vec-set-1.c with -O0 and run the resulting executable, I get a segmentation fault:

$ /export/users/haochenj/env/build_no_bootstrap_future/gcc/xgcc -B/export/users/haochenj/env/build_no_bootstrap_future/gcc/ /export/users/haochenj/src/gcc/future/gcc/testsuite/gcc.target/i386/avx512f-vec-set-2.c -m32 -fdiagnostics-plain-output -O0 -mavx512f -mno-avx512bw -lm -o ./avx512f-vec-set-2.exe
$ ./avx512f-vec-set-2.exe
Segmentation fault (core dumped)

-O1 and clang are both ok.

Derived testcase: https://godbolt.org/z/dTxn7TG1c

The segmentation fault happens at the vmovdqa64 here:

.L4:
        ...
        leaq    -192(%rbp), %rdx
        vmovdqa64       (%rdx), %zmm0
        movl    $50, %esi
        movl    %eax, %edi
        call    foo_v64qi(char __vector(64), char, unsigned int)
        ...

%rbp is not 64-byte aligned at this point because of the prologue of test_512():

test_512():
        leaq    8(%rsp), %r10
        andq    $-64, %rsp
        pushq   -8(%r10)
        pushq   %rbp
        movq    %rsp, %rbp
        pushq   %r10
        ...

After %rsp is aligned to 64 bytes, the following pushes move it off the 64-byte boundary before it is copied into %rbp, which ruins the alignment.

At -O1 the %rbp/%rsp setup is similar, but the vmovdqa64 uses an exact offset from %rbp that leaves the memory operand aligned:

.L3:
        ...
        vmovdqa64       -176(%rbp), %zmm0
        call    foo_v64qi(char __vector(64), char, unsigned int)
        ...

clang aligns %rsp after all the pushes and uses vmovdqu64, the unaligned form, for the load:

test_512():                             # @test_512()
        pushq   %rbp
        movq    %rsp, %rbp
        andq    $-64, %rsp
        subq    $256, %rsp              # imm = 0x100
        ...
.LBB1_5:
        ...
        vmovdqu64       160(%rsp), %zmm0
        movl    232(%rsp), %eax
        movl    $50, %esi
        movsbl  %al, %edi
        callq   foo_v64qi(char vector[64], char, unsigned int)
        ...

The issue has probably existed since vmovdqa64 was introduced.