https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94680

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuho...@gcc.gnu.org>:

https://gcc.gnu.org/g:94de7e225c1fda079052c3f0725c926437d56c94

commit r12-760-g94de7e225c1fda079052c3f0725c926437d56c94
Author: liuhongt <hongtao....@intel.com>
Date:   Thu Apr 22 15:33:16 2021 +0800

    Optimize __builtin_shuffle when it's used to zero the upper bits of the dest. [PR target/94680]

    If the second operand of __builtin_shuffle is a constant zero vector
    and the mask keeps only the low half of the first operand, filling the
    upper half with zeros, the shuffle can be optimized to a single
    movq/vmovaps.

    i.e.
    foo128:
    -       vxorps  %xmm1, %xmm1, %xmm1
    -       vmovlhps        %xmm1, %xmm0, %xmm0
    +       vmovq   %xmm0, %xmm0

     foo256:
    -       vxorps  %xmm1, %xmm1, %xmm1
    -       vshuff32x4      $0, %ymm1, %ymm0, %ymm0
    +       vmovaps %xmm0, %xmm0

     foo512:
    -       vxorps  %xmm1, %xmm1, %xmm1
    -       vshuff32x4      $68, %zmm1, %zmm0, %zmm0
    +       vmovaps %ymm0, %ymm0
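
For illustration, source of the kind that produces the foo128 case above can be sketched as follows. This is a minimal sketch, not the test added by the commit: the typedef names, the float element type, the mask values {0, 1, 4, 5} and the helper check_foo128 are assumptions chosen to match the "keep the low half, zero the upper half" pattern. It relies on GCC's vector extensions, so it needs GCC rather than a generic C compiler.

```c
#include <assert.h>

/* Hypothetical typedefs using GCC's generic vector extension. */
typedef float v4sf __attribute__((vector_size(16)));
typedef int   v4si __attribute__((vector_size(16)));

/* Keep elements 0 and 1 of x and zero elements 2 and 3.
   Mask indices 0..3 select from the first operand (x),
   indices 4..7 select from the second operand (zero). */
v4sf foo128(v4sf x)
{
    const v4sf zero = {0.0f, 0.0f, 0.0f, 0.0f};
    const v4si mask = {0, 1, 4, 5};
    return __builtin_shuffle(x, zero, mask);
}

/* Small self-check of the shuffle semantics (not of the generated code). */
void check_foo128(void)
{
    v4sf r = foo128((v4sf){1.0f, 2.0f, 3.0f, 4.0f});
    assert(r[0] == 1.0f && r[1] == 2.0f);
    assert(r[2] == 0.0f && r[3] == 0.0f);
}
```

Compiling such a function with something like gcc -O2 -mavx -S and inspecting the assembly is one way to see whether a given GCC build emits the single-move form; the exact output depends on the compiler version and target flags.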

    gcc/ChangeLog:

            PR target/94680
            * config/i386/sse.md (ssedoublevecmode): Add attribute for
            V64QI/V32HI/V16SI/V4DI.
            (ssehalfvecmode): Add attribute for V2DI/V2DF.
            (*vec_concatv4si_0): Extend to VI124_128.
            (*vec_concat<mode>_0): New pre-reload splitter.
            * config/i386/predicates.md (movq_parallel): New predicate.

    gcc/testsuite/ChangeLog:

            PR target/94680
            * gcc.target/i386/avx-pr94680.c: New test.
            * gcc.target/i386/avx512f-pr94680.c: New test.
            * gcc.target/i386/sse2-pr94680.c: New test.
