https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116046

            Bug ID: 116046
           Summary: vmovdqa64 is used when unaligned memory caused by
                    unaligned %rsp/%rbp
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: haochen.jiang at intel dot com
  Target Milestone: ---

On x86_64-pc-linux-gnu, when I compile the test avx512f-vec-set-2.c with -O0
and run the resulting executable, I get a segmentation fault:

$ /export/users/haochenj/env/build_no_bootstrap_future/gcc/xgcc
-B/export/users/haochenj/env/build_no_bootstrap_future/gcc/
/export/users/haochenj/src/gcc/future/gcc/testsuite/gcc.target/i386/avx512f-vec-set-2.c
 -m32   -fdiagnostics-plain-output   -O0 -mavx512f -mno-avx512bw  -lm  -o
./avx512f-vec-set-2.exe
$ ./avx512f-vec-set-2.exe
Segmentation fault (core dumped)

-O1 and clang are both ok.

Derived testcase: https://godbolt.org/z/dTxn7TG1c

The segmentation fault happens at the vmovdqa64 here:

.L4:
        ...
        leaq    -192(%rbp), %rdx
        vmovdqa64       (%rdx), %zmm0
        movl    $50, %esi
        movl    %eax, %edi
        call    foo_v64qi(char __vector(64), char, unsigned int)
        ...

The %rbp here is not 64-byte aligned because of the prologue at the beginning
of test_512():

test_512():
        leaq    8(%rsp), %r10
        andq    $-64, %rsp
        pushq   -8(%r10)
        pushq   %rbp
        movq    %rsp, %rbp
        pushq   %r10
        ...

After %rsp is aligned to 64 bytes, the two following pushes break the
alignment, so %rbp ends up 16 bytes below a 64-byte boundary and -192(%rbp) is
not 64-byte aligned.

With -O1, the %rbp/%rsp setup is similar, but the vmovdqa64 uses an offset
that leaves the memory operand aligned:

.L3:
        ...
        vmovdqa64       -176(%rbp), %zmm0
        call    foo_v64qi(char __vector(64), char, unsigned int)
        ...

clang aligns %rsp after all the pushes and uses vmovdqu64, which accepts
unaligned memory.

test_512():                           # @test_512()
        pushq   %rbp
        movq    %rsp, %rbp
        andq    $-64, %rsp
        subq    $256, %rsp                      # imm = 0x100
        ...
.LBB1_5:
        ...
        vmovdqu64       160(%rsp), %zmm0
        movl    232(%rsp), %eax
        movl    $50, %esi
        movsbl  %al, %edi
        callq   foo_v64qi(char vector[64], char, unsigned int)
        ...

The issue has probably existed since vmovdqa64 was introduced.
