https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114987

--- Comment #7 from Haochen Jiang <haochen.jiang at intel dot com> ---
Furthermore, when I build with GCC 11, the codegen for the same loop is much better:

        vaddps       0xc0(%rsp),%ymm5,%ymm2
        vaddps       0xe0(%rsp),%ymm4,%ymm1
        vmovaps      %ymm2,0x80(%rsp)
        vmovdqa      0x90(%rsp),%xmm6
        vmovaps      %ymm1,0xa0(%rsp)
        vmovdqa      0xb0(%rsp),%xmm7
        vmovdqa      %xmm2,0xc0(%rsp)
        vmovdqa      %xmm6,0xd0(%rsp)
        vmovdqa      %xmm1,0xe0(%rsp)
        vmovdqa      %xmm7,0xf0(%rsp)
        sub          $0x1,%eax
        jne          401e00 <stress_vecfp_float_add_16.avx.1+0x1e0>

It seems there may be two separate issues behind this regression.
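
For anyone who wants to compare codegen across compiler versions, here is a minimal sketch of the kind of loop that could sit behind a symbol like stress_vecfp_float_add_16. It is not the actual benchmark source: the names vec16f and vec_add_loop are hypothetical, and the 16-lane float vector built with the GCC vector_size extension is an assumption based only on the symbol name.

```c
/* Hypothetical stand-in for the benchmark kernel, not the real source.
   A 16-lane float vector via the GCC/Clang vector_size extension;
   without AVX-512 the compiler has to split it into narrower registers. */
typedef float vec16f __attribute__((vector_size(64)));

/* Add *inc to *acc `iters` times and store the result back in *acc.
   Vectors are passed by pointer to avoid ABI warnings about passing
   64-byte vectors by value. */
void vec_add_loop(vec16f *acc, const vec16f *inc, int iters)
{
    vec16f a = *acc;
    const vec16f b = *inc;

    for (int i = 0; i < iters; i++)
        a += b;   /* element-wise add across all 16 lanes */

    *acc = a;
}
```

Building this with each compiler (for example with -O2 -mavx, which may differ from the flags used for the listing above) and disassembling with objdump -d should show whether the loop body keeps the accumulator in registers or spills it to the stack on every iteration.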
