On Tue, Aug 23, 2022 at 6:10 PM Alexander Monakov via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > The crc32q instruction takes 64-bit operands, but ignores high 32 bits > of the destination operand, and zero-extends the result from 32 bits. > > Let's model this in the RTL pattern to avoid zero-extension when the > _mm_crc32_u64 intrinsic is used with a 32-bit type. > > PR target/106453 > > gcc/ChangeLog: > > * config/i386/i386.md (sse4_2_crc32di): Model that only low 32 > bits of operand 0 are consumed, and the result is zero-extended > to 64 bits. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr106453.c: New test.
OK with a nit and a couple of changes to the testcase dg-directives. > --- > gcc/config/i386/i386.md | 6 +++--- > gcc/testsuite/gcc.target/i386/pr106453.c | 13 +++++++++++++ > 2 files changed, 16 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr106453.c > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index 58fcc382f..b5760bb23 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -23823,10 +23823,10 @@ > > (define_insn "sse4_2_crc32di" > [(set (match_operand:DI 0 "register_operand" "=r") > - (unspec:DI > - [(match_operand:DI 1 "register_operand" "0") > + (zero_extend:DI (unspec:SI > + [(match_operand:SI 1 "register_operand" "0") > (match_operand:DI 2 "nonimmediate_operand" "rm")] > - UNSPEC_CRC32))] > + UNSPEC_CRC32)))] Usually the (unspec) part comes in the next line. > "TARGET_64BIT && TARGET_CRC32" > "crc32{q}\t{%2, %0|%0, %2}" > [(set_attr "type" "sselog1") > diff --git a/gcc/testsuite/gcc.target/i386/pr106453.c > b/gcc/testsuite/gcc.target/i386/pr106453.c > new file mode 100644 > index 000000000..bab5b1cb2 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr106453.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile } */ > +/* { dg-options "-msse4.2 -O2 -fdump-rtl-final" } */ > +/* { dg-final { scan-rtl-dump-not "zero_extendsidi" "final" } } */ This part can use scan-asembler-not directive, with a -dp compiler option: +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -mcrc32 -dp" } */ +/* { dg-final { scan-assembler-not "zero_extendsidi" } } */ Also, the mainline compiler can use -mcrc32. Please find all these suggestions implemented in the attached patch. Thanks, Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 1aef1af594d..57771ed84f5 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -23823,10 +23823,11 @@ (define_insn "sse4_2_crc32<mode>" (define_insn "sse4_2_crc32di" [(set (match_operand:DI 0 "register_operand" "=r") - (unspec:DI - [(match_operand:DI 1 "register_operand" "0") - (match_operand:DI 2 "nonimmediate_operand" "rm")] - UNSPEC_CRC32))] + (zero_extend:DI + (unspec:SI + [(match_operand:SI 1 "register_operand" "0") + (match_operand:DI 2 "nonimmediate_operand" "rm")] + UNSPEC_CRC32)))] "TARGET_64BIT && TARGET_CRC32" "crc32{q}\t{%2, %0|%0, %2}" [(set_attr "type" "sselog1") diff --git a/gcc/testsuite/gcc.target/i386/pr106453.c b/gcc/testsuite/gcc.target/i386/pr106453.c new file mode 100644 index 00000000000..bd2e7282cf6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr106453.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -mcrc32 -dp" } */ +/* { dg-final { scan-assembler-not "zero_extendsidi" } } */ + +#include <immintrin.h> +#include <stdint.h> + +uint32_t f(uint32_t c, uint64_t *p, size_t n) +{ + for (size_t i = 0; i < n; i++) + c = _mm_crc32_u64(c, p[i]); + return c; +}