https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110235

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuho...@gcc.gnu.org>:

https://gcc.gnu.org/g:f8e02702726d4514b8ff9f5481c9c1f5d34e1787

commit r14-1917-gf8e02702726d4514b8ff9f5481c9c1f5d34e1787
Author: liuhongt <hongtao....@intel.com>
Date:   Thu Jun 15 16:46:14 2023 +0800

    Refined 256/512-bit vpacksswb/vpackssdw patterns.

    The packing in vpacksswb/vpackssdw is not a simple concatenation; it
    interleaves src1 and src2 within every 128-bit lane (that is, every
    64 bits of the ss_truncate result).

    i.e.

    dst[192-255] = ss_truncate (src2[128-255])
    dst[128-191] = ss_truncate (src1[128-255])
    dst[64-127] = ss_truncate (src2[0-127])
    dst[0-63] = ss_truncate (src1[0-127])

    The patch refines those patterns with an extra vec_select that
    models the interleave.
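
The per-lane mapping described above can be sketched as a scalar model of
the 256-bit vpacksswb case. This is only an illustration, not code from the
patch: the names ss_truncate16 and packsswb256_model are hypothetical, and
the model assumes element 0 holds bits 0-15 of a source (bits 0-7 of the
destination).

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturation of a 16-bit value to 8 bits (the ss_truncate step). */
static int8_t ss_truncate16(int16_t x)
{
    if (x > INT8_MAX)
        return INT8_MAX;
    if (x < INT8_MIN)
        return INT8_MIN;
    return (int8_t)x;
}

/* Hypothetical scalar model of 256-bit vpacksswb.
   Each source holds two 128-bit lanes of eight 16-bit words.
   Within each lane, the low 64 bits of the result come from src1
   and the high 64 bits come from src2.  */
static void packsswb256_model(const int16_t src1[16],
                              const int16_t src2[16],
                              int8_t dst[32])
{
    assert(src1 && src2 && dst);
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 8; i++) {
            /* Low half of this lane: saturated words from src1. */
            dst[lane * 16 + i] = ss_truncate16(src1[lane * 8 + i]);
            /* High half of this lane: saturated words from src2. */
            dst[lane * 16 + 8 + i] = ss_truncate16(src2[lane * 8 + i]);
        }
    }
}
```

A plain concatenation would instead place all sixteen src1 results in
dst[0..15] and all src2 results in dst[16..31]; in this model dst[8..15]
already comes from src2 and dst[16..23] from src1, which is the difference
the extra vec_select has to express.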

    gcc/ChangeLog:

            PR target/110235
            * config/i386/sse.md (<sse2_avx2>_packsswb<mask_name>):
            Substitute with ..
            (sse2_packsswb<mask_name>): .. this, ..
            (avx2_packsswb<mask_name>): .. this and ..
            (avx512bw_packsswb<mask_name>): .. this.
            (<sse2_avx2>_packssdw<mask_name>): Substitute with ..
            (sse2_packssdw<mask_name>): .. this, ..
            (avx2_packssdw<mask_name>): .. this and ..
            (avx512bw_packssdw<mask_name>): .. this.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/avx512bw-vpackssdw-3.c: New test.
            * gcc.target/i386/avx512bw-vpacksswb-3.c: New test.
