https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89346

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <h...@gcc.gnu.org>:

https://gcc.gnu.org/g:5358e8f5800daa0012fc9d06705d64bbb21fa07b

commit r10-7054-g5358e8f5800daa0012fc9d06705d64bbb21fa07b
Author: H.J. Lu <hjl.to...@gmail.com>
Date:   Thu Mar 5 16:45:05 2020 -0800

    i386: Properly encode vector registers in vector move

    On x86, when AVX and AVX512 are enabled, vector move instructions can
    be encoded with either a 2-byte/3-byte VEX prefix (AVX) or a 4-byte
    EVEX prefix (AVX512):

       0:       c5 f9 6f d1             vmovdqa %xmm1,%xmm2
       4:       62 f1 fd 08 6f d1       vmovdqa64 %xmm1,%xmm2
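
    For instance (a hypothetical snippet, not taken from the patch or its
    testsuite), a plain 128-bit copy compiles to such vector moves, and
    which of the two encodings the compiler emits is exactly the choice
    discussed below:

      /* Compile with e.g. "gcc -O2 -mavx512f -S" and check whether the
         moves come out as vmovdqa (VEX) or vmovdqa64 (EVEX).  */
      #include <immintrin.h>

      void
      copy128 (__m128i *dst, const __m128i *src)
      {
        *dst = *src;           /* a 128-bit aligned vector load and store */
      }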

    We prefer VEX encoding over EVEX since VEX is shorter.  Also, AVX512F
    only supports 512-bit vector moves, while AVX512F + AVX512VL supports
    128-bit and 256-bit vector moves; xmm16-xmm31 and ymm16-ymm31 are
    disallowed in 128-bit and 256-bit modes when AVX512VL is disabled.
    Mode attributes on x86 vector move patterns indicate the target's
    preferred vector move encoding.  For a scalar register-to-register
    move, we can use a 512-bit vector move instruction to move a
    32-bit/64-bit scalar if AVX512VL isn't available.  With AVX512F and
    AVX512VL, we should use VEX encoding for 128-bit/256-bit vector moves
    if the upper 16 vector registers aren't used.  This patch adds a
    function, ix86_output_ssemov, to generate vector moves (a simplified
    model of the logic follows the list):

    1. If zmm registers are used, use EVEX encoding.
    2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
    will be generated.
    3. If xmm16-xmm31/ymm16-ymm31 registers are used:
       a. With AVX512VL, AVX512VL vector moves will be generated.
       b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register-to-register
          moves will be done with a zmm register move.
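
    A minimal standalone C model of this selection logic (illustrative
    only: the real ix86_output_ssemov in i386.c works on RTL operands and
    insn modes; pick_ssemov and its parameters below are made-up names):

      #include <stdbool.h>
      #include <stdio.h>

      enum vec_width { V128, V256, V512 };

      /* Choose a mnemonic for an integer vector register-to-register
         move, following rules 1-3 above.  uses_ext_reg means the move
         touches xmm16-xmm31/ymm16-ymm31.  */
      static const char *
      pick_ssemov (enum vec_width width, bool uses_ext_reg, bool has_avx512vl)
      {
        if (width == V512)
          return "vmovdqa64";      /* rule 1: zmm moves are always EVEX */
        if (!uses_ext_reg)
          return "vmovdqa";        /* rule 2: SSE or VEX encoding */
        if (has_avx512vl)
          return "vmovdqa64";      /* rule 3a: EVEX 128/256-bit move */
        /* Rule 3b: without AVX512VL, widen to a zmm-to-zmm move.  */
        return "vmovdqa64 (as a zmm-to-zmm move)";
      }

      int
      main (void)
      {
        printf ("%s\n", pick_ssemov (V128, false, false));  /* vmovdqa */
        printf ("%s\n", pick_ssemov (V128, true, true));    /* vmovdqa64 */
        printf ("%s\n", pick_ssemov (V128, true, false));   /* zmm move */
        return 0;
      }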

    There is no need to set the mode attribute to XImode explicitly, since
    ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31
    registers with and without AVX512VL.
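
    For example (a hypothetical snippet, not one of the new tests), code
    that pins a value to xmm16 can be compiled with -mavx512f alone; when
    AVX512VL isn't available, the resulting 128-bit register-to-register
    moves are expected to be carried out as zmm register moves:

      #include <immintrin.h>

      __m128i
      through_xmm16 (__m128i x)
      {
        register __m128i t __asm__ ("xmm16") = x;  /* pin t to xmm16 */
        __asm__ ("" : "+v" (t));                   /* keep t live in xmm16 */
        return t;                                  /* forces xmm16 -> xmm0 */
      }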

    Tested on AVX2 and AVX512 with and without --with-arch=native.

    gcc/

        PR target/89229
        PR target/89346
        * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
        * config/i386/i386.c (ix86_get_ssemov): New function.
        (ix86_output_ssemov): Likewise.
        * config/i386/sse.md (VMOVE:mov<mode>_internal): Call
        ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
        check.
        (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
        (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
        Remove ext_sse_reg_operand and TARGET_AVX512VL check.
        (*movti_internal): Likewise.
        (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.

    gcc/testsuite/

        PR target/89229
        PR target/89346
        * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
        * gcc.target/i386/pr89229-2a.c: New test.
        * gcc.target/i386/pr89229-2b.c: Likewise.
        * gcc.target/i386/pr89229-2c.c: Likewise.
        * gcc.target/i386/pr89229-3a.c: Likewise.
        * gcc.target/i386/pr89229-3b.c: Likewise.
        * gcc.target/i386/pr89229-3c.c: Likewise.
        * gcc.target/i386/pr89346.c: Likewise.
