On Sat, Aug 24, 2024 at 9:22 AM Mariam Arutunian
<mariamarutun...@gmail.com> wrote:
>
>
>
> On Fri, Aug 23, 2024, 15:03 Richard Biener <richard.guent...@gmail.com> wrote:
>>
>> On Fri, Aug 23, 2024 at 9:55 AM Mariam Arutunian
>> <mariamarutun...@gmail.com> wrote:
>> >
>> >
>> > On Wed, Aug 21, 2024 at 5:56 PM Richard Sandiford 
>> > <richard.sandif...@arm.com> wrote:
>> >>
>> >> Mariam Arutunian <mariamarutun...@gmail.com> writes:
>> >> > This patch introduces two new expanders for the aarch64 backend,
>> >> > dedicated to generating optimized code for CRC computations.
>> >> > The new expanders are designed to leverage specific hardware
>> >> > capabilities to achieve faster CRC calculations, particularly
>> >> > using the crc32, crc32c and pmull instructions when supported
>> >> > by the target architecture.
>> >> >
>> >> > Expander 1: Bit-Forward CRC (crc<ALLI:mode><ALLX:mode>4)
>> >> > For targets that support the pmull instruction (TARGET_AES),
>> >> > the expander generates code that uses the pmull (crypto_pmulldi)
>> >> > instruction for CRC computation.
>> >> >
>> >> > Expander 2: Bit-Reversed CRC (crc_rev<ALLI:mode><ALLX:mode>4)
>> >> > The expander first checks whether the target supports the CRC32*
>> >> > instruction set (TARGET_CRC32) and the polynomial in use is
>> >> > 0x1EDC6F41 (iSCSI) or 0x04C11DB7 (HDLC).  If the conditions are met,
>> >> > it emits the corresponding crc32* instruction (depending on the
>> >> > data size and the polynomial).
>> >> > If the target does not support crc32* but supports pmull, it then
>> >> > uses the pmull (crypto_pmulldi) instruction for bit-reversed CRC
>> >> > computation.  Otherwise table-based CRC is generated.
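The selection order described for the bit-reversed expander can be
sketched in plain C as below.  This is only an illustration of the
decision order, not GCC code: the enum, the function name
pick_reversed_crc_strategy and its parameters are hypothetical stand-ins
for TARGET_CRC32, TARGET_AES and the mode-size checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the four code-generation paths.  */
enum crc_strategy
{
  CRC_USE_CRC32C,   /* crc32c* instruction (iSCSI polynomial) */
  CRC_USE_CRC32,    /* crc32* instruction (HDLC polynomial) */
  CRC_USE_PMULL,    /* carry-less multiplication */
  CRC_USE_TABLE     /* table-based fallback */
};

/* have_crc32 and have_pmull model the target feature checks;
   crc_bits and data_bits model the CRC and data mode sizes.  */
static enum crc_strategy
pick_reversed_crc_strategy (bool have_crc32, bool have_pmull,
                            uint64_t poly, unsigned crc_bits,
                            unsigned data_bits)
{
  if (have_crc32 && crc_bits == 32 && poly == 0x1EDC6F41)
    return CRC_USE_CRC32C;
  if (have_crc32 && crc_bits == 32 && poly == 0x04C11DB7)
    return CRC_USE_CRC32;
  if (have_pmull && data_bits <= crc_bits)
    return CRC_USE_PMULL;
  return CRC_USE_TABLE;
}
```

Note that the dedicated instructions are only considered for a 32-bit
CRC, so a 16-bit CRC goes to the pmull path or the table even when the
CRC extension is available.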
>> >> >
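As a rough illustration of what a carry-less-multiply step computes for
the bit-forward case, here is a minimal software sketch in C.  It assumes
a Barrett-style reduction using the quotient of x^(2n) divided by the full
polynomial.  All helper names (clmul_lo, gf2_quotient, crc_forward_clmul,
crc_forward_bitwise) are hypothetical; clmul_lo only emulates a 64-bit
carry-less multiply in software and is neither the pmull instruction nor a
GCC helper.

```c
#include <stdint.h>

/* Software stand-in for a carry-less multiply; keeps the low 64 bits.  */
static uint64_t
clmul_lo (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  for (int i = 0; i < 64; i++)
    if ((b >> i) & 1)
      r ^= a << i;
  return r;
}

/* Quotient of x^(2n) divided by (x^n + poly) over GF(2), 1 <= n <= 32.  */
static uint64_t
gf2_quotient (uint32_t poly, unsigned n)
{
  uint64_t full = ((uint64_t) 1 << n) | poly;
  uint64_t rem = (uint64_t) 1 << n;
  uint64_t q = 0;
  for (int i = (int) n; i >= 0; i--)
    {
      if ((rem >> n) & 1)
        {
          q |= (uint64_t) 1 << i;
          rem ^= full;
        }
      rem <<= 1;
    }
  return q;
}

/* One CRC step for an n-bit CRC and m-bit data (m <= n <= 32),
   using two carry-less multiplications instead of a bit loop.  */
static uint32_t
crc_forward_clmul (uint32_t crc, uint32_t data, uint32_t poly,
                   unsigned n, unsigned m)
{
  uint64_t mask = ((uint64_t) 1 << n) - 1;
  uint64_t a = ((uint64_t) crc >> (n - m)) ^ data;
  a = clmul_lo (a, gf2_quotient (poly, n)) >> n;  /* estimate quotient */
  a = clmul_lo (a, poly);                         /* multiply back */
  a ^= (uint64_t) crc << m;                       /* untouched low CRC bits */
  return (uint32_t) (a & mask);
}

/* Bit-by-bit reference for the same step.  */
static uint32_t
crc_forward_bitwise (uint32_t crc, uint32_t data, uint32_t poly,
                     unsigned n, unsigned m)
{
  uint64_t mask = ((uint64_t) 1 << n) - 1;
  uint64_t c = crc ^ ((uint64_t) data << (n - m));
  for (unsigned i = 0; i < m; i++)
    {
      uint64_t top = (c >> (n - 1)) & 1;
      c = (c << 1) & mask;
      if (top)
        c ^= poly;
    }
  return (uint32_t) c;
}
```

Both functions expect crc to fit in n bits and data to fit in m bits.
Comparing them over a range of inputs is a cheap way to check the
multiply-based form against the loop it replaces.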
>> >> >   gcc/config/aarch64/
>> >> >
>> >> >     * aarch64-protos.h (aarch64_expand_crc_using_pmull): New extern
>> >> > function declaration.
>> >> >     (aarch64_expand_reversed_crc_using_pmull):  Likewise.
>> >> >     * aarch64.cc (aarch64_expand_crc_using_pmull): New function.
>> >> >     (aarch64_expand_reversed_crc_using_pmull):  Likewise.
>> >> >     * aarch64.md (crc_rev<ALLI:mode><ALLX:mode>4): New expander for
>> >> > reversed CRC.
>> >> >     (crc<ALLI:mode><ALLX:mode>4): New expander for bit-forward CRC.
>> >> >     * iterators.md (crc_data_type): New mode attribute.
>> >> >
>> >> >   gcc/testsuite/gcc.target/aarch64/
>> >> >
>> >> >     * crc-1-pmul.c: New test.
>> >> >     * crc-10-pmul.c: Likewise.
>> >> >     * crc-12-pmul.c: Likewise.
>> >> >     * crc-13-pmul.c: Likewise.
>> >> >     * crc-14-pmul.c: Likewise.
>> >> >     * crc-17-pmul.c: Likewise.
>> >> >     * crc-18-pmul.c: Likewise.
>> >> >     * crc-21-pmul.c: Likewise.
>> >> >     * crc-22-pmul.c: Likewise.
>> >> >     * crc-23-pmul.c: Likewise.
>> >> >     * crc-4-pmul.c: Likewise.
>> >> >     * crc-5-pmul.c: Likewise.
>> >> >     * crc-6-pmul.c: Likewise.
>> >> >     * crc-7-pmul.c: Likewise.
>> >> >     * crc-8-pmul.c: Likewise.
>> >> >     * crc-9-pmul.c: Likewise.
>> >> >     * crc-CCIT-data16-pmul.c: Likewise.
>> >> >     * crc-CCIT-data8-pmul.c: Likewise.
>> >> >     * crc-coremark-16bitdata-pmul.c: Likewise.
>> >> >     * crc-crc32-data16.c: Likewise.
>> >> >     * crc-crc32-data32.c: Likewise.
>> >> >     * crc-crc32-data8.c: Likewise.
>> >> >     * crc-crc32c-data16.c: Likewise.
>> >> >     * crc-crc32c-data32.c: Likewise.
>> >> >     * crc-crc32c-data8.c: Likewise.
>> >>
>> >> OK for trunk once the prerequisites are approved.  Thanks for all your
>> >> work on this.
>> >>
>> >> Which other parts of the series still need review?  I can try to help
>> >> out with the target-independent bits.  (That said, I'm not sure I'm the
>> >> best person to review the tree recognition pass, but I can have a go.)
>> >>
>> >
>> > Thank you very much for everything.
>> > Right now, I'm not sure which parts would be best to review, since
>> > Richard Biener is currently reviewing them.
>> > Maybe I can ask for your help later?
>>
>> I'm done with the parts I reserved for reviewing.  Btw, it seems the
>> vN series are not complete, that is, you didn't re-post the entire
>> series but only changed parts?  I was somewhat confused by that.
>
>
> Yes, I didn't re-post the entire series; I only resent the parts that were 
> modified. I didn't know that I needed to send the entire series each time. 
> I'll make sure to do that in the next versions.

It's fine to only post changed parts; I just missed a note that you
did this, so I was searching for a revised version of an older patch
I had in the queue for reviewing.  But re-posting the entire series is
fine as well and probably the least confusing to everyone (including
the pre-commit CI).

Richard.

> Thanks,
> Mariam
>
>
>>
>> Richard.
>>
>> > Thanks,
>> > Mariam
>> >
>> >> Richard
>> >>
>> >> >
>> >> > Signed-off-by: Mariam Arutunian <mariamarutun...@gmail.com>
>> >> > Co-authored-by: Richard Sandiford <richard.sandif...@arm.com>
>> >> > diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
>> >> > index 42639e9efcf..469111e3b17 100644
>> >> > --- a/gcc/config/aarch64/aarch64-protos.h
>> >> > +++ b/gcc/config/aarch64/aarch64-protos.h
>> >> > @@ -1112,5 +1112,8 @@ extern void aarch64_adjust_reg_alloc_order ();
>> >> >
>> >> >  bool aarch64_optimize_mode_switching (aarch64_mode_entity);
>> >> >  void aarch64_restore_za (rtx);
>> >> > +void aarch64_expand_crc_using_pmull (scalar_mode, scalar_mode, rtx *);
>> >> > +void aarch64_expand_reversed_crc_using_pmull (scalar_mode, scalar_mode, rtx *);
>> >> > +
>> >> >
>> >> >  #endif /* GCC_AARCH64_PROTOS_H */
>> >> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> >> > index 7f0cc47d0f0..0cb8f3e8090 100644
>> >> > --- a/gcc/config/aarch64/aarch64.cc
>> >> > +++ b/gcc/config/aarch64/aarch64.cc
>> >> > @@ -30314,6 +30314,137 @@ aarch64_retrieve_sysreg (const char *regname, bool write_p, bool is128op)
>> >> >    return sysreg->encoding;
>> >> >  }
>> >> >
>> >> > +/* Generate assembly to calculate CRC
>> >> > +   using carry-less multiplication instruction.
>> >> > +   OPERANDS[1] is input CRC,
>> >> > +   OPERANDS[2] is data (message),
>> >> > +   OPERANDS[3] is the polynomial without the leading 1.  */
>> >> > +
>> >> > +void
>> >> > +aarch64_expand_crc_using_pmull (scalar_mode crc_mode,
>> >> > +                             scalar_mode data_mode,
>> >> > +                             rtx *operands)
>> >> > +{
>> >> > +  /* Check and keep arguments.  */
>> >> > +  gcc_assert (!CONST_INT_P (operands[0]));
>> >> > +  gcc_assert (CONST_INT_P (operands[3]));
>> >> > +  rtx crc = operands[1];
>> >> > +  rtx data = operands[2];
>> >> > +  rtx polynomial = operands[3];
>> >> > +
>> >> > +  unsigned HOST_WIDE_INT crc_size = GET_MODE_BITSIZE (crc_mode);
>> >> > +  unsigned HOST_WIDE_INT data_size = GET_MODE_BITSIZE (data_mode);
>> >> > +  gcc_assert (crc_size <= 32);
>> >> > +  gcc_assert (data_size <= crc_size);
>> >> > +
>> >> > +  /* Calculate the quotient.  */
>> >> > +  unsigned HOST_WIDE_INT
>> >> > +      q = gf2n_poly_long_div_quotient (UINTVAL (polynomial), crc_size);
>> >> > +  /* CRC calculation's main part.  */
>> >> > +  if (crc_size > data_size)
>> >> > +    crc = expand_shift (RSHIFT_EXPR, DImode, crc, crc_size - data_size,
>> >> > +                     NULL_RTX, 1);
>> >> > +
>> >> > +  rtx t0 = force_reg (DImode, gen_int_mode (q, DImode));
>> >> > +  polynomial = simplify_gen_unary (ZERO_EXTEND, DImode, polynomial,
>> >> > +                                GET_MODE (polynomial));
>> >> > +  rtx t1 = force_reg (DImode, polynomial);
>> >> > +
>> >> > +  rtx a0 = expand_binop (DImode, xor_optab, crc, data, NULL_RTX, 1,
>> >> > +                      OPTAB_WIDEN);
>> >> > +
>> >> > +  rtx pmull_res = gen_reg_rtx (TImode);
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t0));
>> >> > +  a0 = gen_lowpart (DImode, pmull_res);
>> >> > +
>> >> > +  a0 = expand_shift (RSHIFT_EXPR, DImode, a0, crc_size, NULL_RTX, 1);
>> >> > +
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t1));
>> >> > +  a0 = gen_lowpart (DImode, pmull_res);
>> >> > +
>> >> > +  if (crc_size > data_size)
>> >> > +    {
>> >> > +      rtx crc_part = expand_shift (LSHIFT_EXPR, DImode, operands[1], data_size,
>> >> > +                                NULL_RTX, 0);
>> >> > +      a0 = expand_binop (DImode, xor_optab, a0, crc_part, NULL_RTX, 1,
>> >> > +                      OPTAB_DIRECT);
>> >> > +    }
>> >> > +
>> >> > +  aarch64_emit_move (operands[0], gen_lowpart (crc_mode, a0));
>> >> > +}
>> >> > +
>> >> > +/* Generate assembly to calculate reversed CRC
>> >> > +   using carry-less multiplication instruction.
>> >> > +   OPERANDS[1] is input CRC,
>> >> > +   OPERANDS[2] is data,
>> >> > +   OPERANDS[3] is the polynomial without the leading 1.  */
>> >> > +
>> >> > +void
>> >> > +aarch64_expand_reversed_crc_using_pmull (scalar_mode crc_mode,
>> >> > +                                      scalar_mode data_mode,
>> >> > +                                      rtx *operands)
>> >> > +{
>> >> > +  /* Check and keep arguments.  */
>> >> > +  gcc_assert (!CONST_INT_P (operands[0]));
>> >> > +  gcc_assert (CONST_INT_P (operands[3]));
>> >> > +  rtx crc = operands[1];
>> >> > +  rtx data = operands[2];
>> >> > +  rtx polynomial = operands[3];
>> >> > +
>> >> > +  unsigned HOST_WIDE_INT crc_size = GET_MODE_BITSIZE (crc_mode);
>> >> > +  unsigned HOST_WIDE_INT data_size = GET_MODE_BITSIZE (data_mode);
>> >> > +  gcc_assert (crc_size <= 32);
>> >> > +  gcc_assert (data_size <= crc_size);
>> >> > +
>> >> > +  /* Calculate the quotient.  */
>> >> > +  unsigned HOST_WIDE_INT
>> >> > +      q = gf2n_poly_long_div_quotient (UINTVAL (polynomial), crc_size);
>> >> > +  /* Reflect the calculated quotient.  */
>> >> > +  q = reflect_hwi (q, crc_size + 1);
>> >> > +  rtx t0 = force_reg (DImode, gen_int_mode (q, DImode));
>> >> > +
>> >> > +  /* Reflect the polynomial.  */
>> >> > +  unsigned HOST_WIDE_INT ref_polynomial = reflect_hwi (UINTVAL (polynomial),
>> >> > +                                                    crc_size);
>> >> > +  /* An unshifted multiplier would require the final result to be extracted
>> >> > +     using a shift right by DATA_SIZE - 1 bits.  Shift the multiplier left
>> >> > +     so that the shift right can be by CRC_SIZE bits instead.  */
>> >> > +  ref_polynomial <<= crc_size - data_size + 1;
>> >> > +  rtx t1 = force_reg (DImode, gen_int_mode (ref_polynomial, DImode));
>> >> > +
>> >> > +  /* CRC calculation's main part.  */
>> >> > +  rtx a0 = expand_binop (DImode, xor_optab, crc, data, NULL_RTX, 1,
>> >> > +                      OPTAB_WIDEN);
>> >> > +
>> >> > +  /* Perform carry-less multiplication and get low part.  */
>> >> > +  rtx pmull_res = gen_reg_rtx (TImode);
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t0));
>> >> > +  a0 = gen_lowpart (DImode, pmull_res);
>> >> > +
>> >> > +  a0 = expand_binop (DImode, and_optab, a0,
>> >> > +                  gen_int_mode (GET_MODE_MASK (data_mode), DImode),
>> >> > +                  NULL_RTX, 1, OPTAB_WIDEN);
>> >> > +
>> >> > +  /* Perform carry-less multiplication.  */
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t1));
>> >> > +
>> >> > +  /* Perform a shift right by CRC_SIZE as an extraction of lane 1.  */
>> >> > +  machine_mode crc_vmode = aarch64_vq_mode (crc_mode).require ();
>> >> > +  a0 = (crc_size > data_size ? gen_reg_rtx (crc_mode) : operands[0]);
>> >> > +  emit_insn (gen_aarch64_get_lane (crc_vmode, a0,
>> >> > +                                gen_lowpart (crc_vmode, pmull_res),
>> >> > +                                aarch64_endian_lane_rtx (crc_vmode, 1)));
>> >> > +
>> >> > +  if (crc_size > data_size)
>> >> > +    {
>> >> > +      rtx crc_part = expand_shift (RSHIFT_EXPR, crc_mode, crc, data_size,
>> >> > +                                NULL_RTX, 1);
>> >> > +      a0 = expand_binop (crc_mode, xor_optab, a0, crc_part, operands[0], 1,
>> >> > +                      OPTAB_WIDEN);
>> >> > +      aarch64_emit_move (operands[0], a0);
>> >> > +    }
>> >> > +}
>> >> > +
>> >> >  /* Target-specific selftests.  */
>> >> >
>> >> >  #if CHECKING_P
>> >> > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> >> > index 9de6235b139..bdb93ccaf76 100644
>> >> > --- a/gcc/config/aarch64/aarch64.md
>> >> > +++ b/gcc/config/aarch64/aarch64.md
>> >> > @@ -4556,6 +4556,63 @@
>> >> >    [(set_attr "type" "crc")]
>> >> >  )
>> >> >
>> >> > +;; Reversed CRC
>> >> > +(define_expand "crc_rev<ALLI:mode><ALLX:mode>4"
>> >> > +  [;; return value (calculated CRC)
>> >> > +   (match_operand:ALLX 0 "register_operand" "=r")
>> >> > +   ;; initial CRC
>> >> > +   (match_operand:ALLX 1 "register_operand" "r")
>> >> > +   ;; data
>> >> > +   (match_operand:ALLI 2 "register_operand" "r")
>> >> > +   ;; polynomial without leading 1
>> >> > +   (match_operand:ALLX 3)]
>> >> > +  ""
>> >> > +  {
>> >> > +    /* If the polynomial is the same as the polynomial of crc32c* instruction,
>> >> > +       put that instruction.  crc32c uses iSCSI polynomial.  */
>> >> > +    if (TARGET_CRC32 && INTVAL (operands[3]) == 0x1EDC6F41
>> >> > +     && <ALLX:MODE>mode == SImode)
>> >> > +      emit_insn (gen_aarch64_crc32c<ALLI:crc_data_type> (operands[0],
>> >> > +                                                      operands[1],
>> >> > +                                                      operands[2]));
>> >> > +    /* If the polynomial is the same as the polynomial of crc32* instruction,
>> >> > +     put that instruction.  crc32 uses HDLC etc.  polynomial.  */
>> >> > +    else if (TARGET_CRC32 && INTVAL (operands[3]) == 0x04C11DB7
>> >> > +          && <ALLX:MODE>mode == SImode)
>> >> > +      emit_insn (gen_aarch64_crc32<ALLI:crc_data_type> (operands[0],
>> >> > +                                                     operands[1],
>> >> > +                                                     operands[2]));
>> >> > +    else if (TARGET_AES && <ALLI:sizen> <= <ALLX:sizen>)
>> >> > +      aarch64_expand_reversed_crc_using_pmull (<ALLX:MODE>mode,
>> >> > +                                            <ALLI:MODE>mode,
>> >> > +                                            operands);
>> >> > +    else
>> >> > +      /* Otherwise, generate table-based CRC.  */
>> >> > +      expand_reversed_crc_table_based (operands[0], operands[1], operands[2],
>> >> > +                                    operands[3], <ALLI:MODE>mode,
>> >> > +                                    generate_reflecting_code_standard);
>> >> > +    DONE;
>> >> > +  }
>> >> > +)
>> >> > +
>> >> > +;; Bit-forward CRC
>> >> > +(define_expand "crc<ALLI:mode><ALLX:mode>4"
>> >> > +  [;; return value (calculated CRC)
>> >> > +   (match_operand:ALLX 0 "register_operand" "=r")
>> >> > +   ;; initial CRC
>> >> > +   (match_operand:ALLX 1 "register_operand" "r")
>> >> > +   ;; data
>> >> > +   (match_operand:ALLI 2 "register_operand" "r")
>> >> > +   ;; polynomial without leading 1
>> >> > +   (match_operand:ALLX 3)]
>> >> > +  "TARGET_AES && <ALLI:sizen> <= <ALLX:sizen>"
>> >> > +  {
>> >> > +    aarch64_expand_crc_using_pmull (<ALLX:MODE>mode, <ALLI:MODE>mode,
>> >> > +                                 operands);
>> >> > +    DONE;
>> >> > +  }
>> >> > +)
>> >> > +
>> >> >  (define_insn "*csinc2<mode>_insn"
>> >> >    [(set (match_operand:GPI 0 "register_operand" "=r")
>> >> >          (plus:GPI (match_operand 2 "aarch64_comparison_operation" "")
>> >> > diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
>> >> > index f527b2cfeb8..8faba9025ce 100644
>> >> > --- a/gcc/config/aarch64/iterators.md
>> >> > +++ b/gcc/config/aarch64/iterators.md
>> >> > @@ -1276,6 +1276,10 @@
>> >> >  ;; Map a mode to a specific constraint character.
>> >> >  (define_mode_attr cmode [(QI "q") (HI "h") (SI "s") (DI "d")])
>> >> >
>> >> > +;; Map a mode to a specific constraint character for calling
>> >> > +;; appropriate version of crc.
>> >> > +(define_mode_attr crc_data_type [(QI "b") (HI "h") (SI "w") (DI "x")])
>> >> > +
>> >> >  ;; Map modes to Usg and Usj constraints for SISD right shifts
>> >> >  (define_mode_attr cmode_simd [(SI "g") (DI "j")])
>> >> >
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..4043251dbd8
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
>> >> > @@ -0,0 +1,8 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-1.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..dd866b38e83
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-10.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..16d901eeaef
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-12.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..5f7741fad0f
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-13.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..cdedbbd3db1
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-14.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..c219e49a2b1
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-17.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..124900a979b
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-18.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..3cae1a7f57b
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-21.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..0ec2e312f8f
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-22.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..0c4542adb40
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-23.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..08f1d3b69d7
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-4.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..91bf5e6353d
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -w -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-5.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..4680eafe758
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-6.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..655484d10d4
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-7.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..6c2acc84c32
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-8.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..e76f3c77b59
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-9.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..21520474564
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-CCIT-data16.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..3dcc92320f3
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-CCIT-data8.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..e5196aaafef
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-coremark16-data16.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
>> >> > new file mode 100644
>> >> > index 00000000000..e82cb04fcc3
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
>> >> > @@ -0,0 +1,53 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint16_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint16_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint16_t i = 0; i < 0xffff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +      abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
>> >> > new file mode 100644
>> >> > index 00000000000..a7564a7e28a
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
>> >> > @@ -0,0 +1,52 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint32_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 32; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint32_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 32; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint8_t i = 0; i < 0xff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +      abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
>> >> > new file mode 100644
>> >> > index 00000000000..c88cafadedc
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
>> >> > @@ -0,0 +1,53 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint8_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint8_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint8_t i = 0; i < 0xff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +      abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c
>> >> > new file mode 100644
>> >> > index 00000000000..d82e6252603
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c
>> >> > @@ -0,0 +1,53 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint16_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0x82F63B78;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint16_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0x82F63B78;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint16_t i = 0; i < 0xffff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +      abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c
>> >> > new file mode 100644
>> >> > index 00000000000..7acb6fc239c
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c
>> >> > @@ -0,0 +1,52 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint32_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 32; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0x82F63B78;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint32_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 32; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0x82F63B78;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint8_t i = 0; i < 0xff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +      abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c
>> >> > new file mode 100644
>> >> > index 00000000000..e8a8901e453
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c
>> >> > @@ -0,0 +1,53 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint8_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0x82F63B78;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint8_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +      if (crc & 1)
>> >> > +     crc = (crc >> 1) ^ 0x82F63B78;
>> >> > +      else
>> >> > +     crc = (crc >> 1);
>> >> > +    }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint8_t i = 0; i < 0xff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +      abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
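
A note for readers comparing the constants in these tests with the polynomials named in the patch description: the tests shift right and XOR with 0xEDB88320 or 0x82F63B78, while the description names 0x04C11DB7 and 0x1EDC6F41. The test constants are the bit-reversed forms of those polynomials, which is what a bit-reversed (LSB-first) CRC loop uses. The helper below is only an illustrative sketch; `reverse_bits32` is a hypothetical name and is not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: reverse the bit order of
   a 32-bit value (bit 0 becomes bit 31, bit 1 becomes bit 30, ...).  */
static uint32_t
reverse_bits32 (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      /* Shift the result left and append the current low bit of x.  */
      r = (r << 1) | (x & 1);
      x >>= 1;
    }
  return r;
}
```

Calling `reverse_bits32 (0x04C11DB7)` yields 0xEDB88320 and `reverse_bits32 (0x1EDC6F41)` yields 0x82F63B78, matching the constants used in the crc32 and crc32c tests respectively.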
