On Sat, Aug 24, 2024 at 9:22 AM Mariam Arutunian <mariamarutun...@gmail.com> wrote:
>
>
>
> On Fri, Aug 23, 2024, 15:03 Richard Biener <richard.guent...@gmail.com> wrote:
>>
>> On Fri, Aug 23, 2024 at 9:55 AM Mariam Arutunian
>> <mariamarutun...@gmail.com> wrote:
>> >
>> >
>> > On Wed, Aug 21, 2024 at 5:56 PM Richard Sandiford
>> > <richard.sandif...@arm.com> wrote:
>> >>
>> >> Mariam Arutunian <mariamarutun...@gmail.com> writes:
>> >> > This patch introduces two new expanders for the aarch64 backend,
>> >> > dedicated to generate optimized code for CRC computations.
>> >> > The new expanders are designed to leverage specific hardware
>> >> > capabilities to achieve faster CRC calculations, particularly using
>> >> > the crc32, crc32c and pmull instructions when supported by the
>> >> > target architecture.
>> >> >
>> >> > Expander 1: Bit-Forward CRC (crc<ALLI:mode><ALLX:mode>4)
>> >> > For targets that support the pmull instruction (TARGET_AES),
>> >> > the expander will generate code that uses the pmull (crypto_pmulldi)
>> >> > instruction for CRC computation.
>> >> >
>> >> > Expander 2: Bit-Reversed CRC (crc_rev<ALLI:mode><ALLX:mode>4)
>> >> > The expander first checks if the target supports the CRC32*
>> >> > instruction set (TARGET_CRC32) and the polynomial in use is
>> >> > 0x1EDC6F41 (iSCSI) or 0x04C11DB7 (HDLC). If the conditions are met,
>> >> > it emits calls to the corresponding crc32* instruction (depending on
>> >> > the data size and the polynomial).
>> >> > If the target does not support crc32* but supports pmull, it then
>> >> > uses the pmull (crypto_pmulldi) instruction for bit-reversed CRC
>> >> > computation.
>> >> > Otherwise table-based CRC is generated.
>> >> >
>> >> > gcc/config/aarch64/
>> >> >
>> >> > * aarch64-protos.h (aarch64_expand_crc_using_pmull): New extern
>> >> > function declaration.
>> >> > (aarch64_expand_reversed_crc_using_pmull): Likewise.
>> >> > * aarch64.cc (aarch64_expand_crc_using_pmull): New function.
>> >> > (aarch64_expand_reversed_crc_using_pmull): Likewise.
>> >> > * aarch64.md (crc_rev<ALLI:mode><ALLX:mode>4): New expander for
>> >> > reversed CRC.
>> >> > (crc<ALLI:mode><ALLX:mode>4): New expander for bit-forward CRC.
>> >> > * iterators.md (crc_data_type): New mode attribute.
>> >> >
>> >> > gcc/testsuite/gcc.target/aarch64/
>> >> >
>> >> > * crc-1-pmul.c: New test.
>> >> > * crc-10-pmul.c: Likewise.
>> >> > * crc-12-pmul.c: Likewise.
>> >> > * crc-13-pmul.c: Likewise.
>> >> > * crc-14-pmul.c: Likewise.
>> >> > * crc-17-pmul.c: Likewise.
>> >> > * crc-18-pmul.c: Likewise.
>> >> > * crc-21-pmul.c: Likewise.
>> >> > * crc-22-pmul.c: Likewise.
>> >> > * crc-23-pmul.c: Likewise.
>> >> > * crc-4-pmul.c: Likewise.
>> >> > * crc-5-pmul.c: Likewise.
>> >> > * crc-6-pmul.c: Likewise.
>> >> > * crc-7-pmul.c: Likewise.
>> >> > * crc-8-pmul.c: Likewise.
>> >> > * crc-9-pmul.c: Likewise.
>> >> > * crc-CCIT-data16-pmul.c: Likewise.
>> >> > * crc-CCIT-data8-pmul.c: Likewise.
>> >> > * crc-coremark-16bitdata-pmul.c: Likewise.
>> >> > * crc-crc32-data16.c: Likewise.
>> >> > * crc-crc32-data32.c: Likewise.
>> >> > * crc-crc32-data8.c: Likewise.
>> >> > * crc-crc32c-data16.c: Likewise.
>> >> > * crc-crc32c-data32.c: Likewise.
>> >> > * crc-crc32c-data8.c: Likewise.
>> >>
>> >> OK for trunk once the prerequisites are approved.  Thanks for all your
>> >> work on this.
>> >>
>> >> Which other parts of the series still need review?  I can try to help
>> >> out with the target-independent bits.  (That said, I'm not sure I'm the
>> >> best person to review the tree recognition pass, but I can have a go.)
>> >>
>> >
>> > Thank you very much for everything.
>> > Right now, I'm not sure which parts would be best to be reviewed since
>> > Richard Biener is currently reviewing them.
>> > Maybe I can ask for your help later?
>>
>> I'm done with the parts I preserved for reviewing.  Btw, it seems the
>> vN series are not complete, that is, you didn't re-post the entire
>> series but only changed parts?  I was somewhat confused by that.
>
>
> Yes, I didn't re-post the entire series; I only resent the parts that were
> modified. I didn't know that I needed to send the entire series each time.
> I'll make sure to do that in the next versions.

It's fine to post only the changed parts; I just missed a note saying that
you had done so, and so I was searching for a revised version of an older
patch I had in the queue for reviewing.  But re-posting the entire series
is fine as well, and is probably the least confusing for everyone
(including the pre-commit CI).

Richard.

> Thanks,
> Mariam
>
>
>>
>> Richard.
>>
>> > Thanks,
>> > Mariam
>> >
>> >> Richard
>> >>
>> >> >
>> >> > Signed-off-by: Mariam Arutunian <mariamarutun...@gmail.com>
>> >> > Co-authored-by: Richard Sandiford <richard.sandif...@arm.com>
>> >> > diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
>> >> > index 42639e9efcf..469111e3b17 100644
>> >> > --- a/gcc/config/aarch64/aarch64-protos.h
>> >> > +++ b/gcc/config/aarch64/aarch64-protos.h
>> >> > @@ -1112,5 +1112,8 @@ extern void aarch64_adjust_reg_alloc_order ();
>> >> >
>> >> >  bool aarch64_optimize_mode_switching (aarch64_mode_entity);
>> >> >  void aarch64_restore_za (rtx);
>> >> > +void aarch64_expand_crc_using_pmull (scalar_mode, scalar_mode, rtx *);
>> >> > +void aarch64_expand_reversed_crc_using_pmull (scalar_mode, scalar_mode, rtx *);
>> >> > +
>> >> >
>> >> >  #endif /* GCC_AARCH64_PROTOS_H */
>> >> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> >> > index 7f0cc47d0f0..0cb8f3e8090 100644
>> >> > --- a/gcc/config/aarch64/aarch64.cc
>> >> > +++ b/gcc/config/aarch64/aarch64.cc
>> >> > @@ -30314,6 +30314,137 @@ aarch64_retrieve_sysreg (const char *regname, bool write_p, bool is128op)
>> >> >    return sysreg->encoding;
>> >> >  }
>> >> >
>> >> > +/* Generate assembly to calculate CRC
>> >> > +   using carry-less multiplication instruction.
>> >> > +   OPERANDS[1] is input CRC,
>> >> > +   OPERANDS[2] is data (message),
>> >> > +   OPERANDS[3] is the polynomial without the leading 1.  */
>> >> > +
>> >> > +void
>> >> > +aarch64_expand_crc_using_pmull (scalar_mode crc_mode,
>> >> > +                                scalar_mode data_mode,
>> >> > +                                rtx *operands)
>> >> > +{
>> >> > +  /* Check and keep arguments.  */
>> >> > +  gcc_assert (!CONST_INT_P (operands[0]));
>> >> > +  gcc_assert (CONST_INT_P (operands[3]));
>> >> > +  rtx crc = operands[1];
>> >> > +  rtx data = operands[2];
>> >> > +  rtx polynomial = operands[3];
>> >> > +
>> >> > +  unsigned HOST_WIDE_INT crc_size = GET_MODE_BITSIZE (crc_mode);
>> >> > +  unsigned HOST_WIDE_INT data_size = GET_MODE_BITSIZE (data_mode);
>> >> > +  gcc_assert (crc_size <= 32);
>> >> > +  gcc_assert (data_size <= crc_size);
>> >> > +
>> >> > +  /* Calculate the quotient.  */
>> >> > +  unsigned HOST_WIDE_INT
>> >> > +      q = gf2n_poly_long_div_quotient (UINTVAL (polynomial), crc_size);
>> >> > +  /* CRC calculation's main part.  */
>> >> > +  if (crc_size > data_size)
>> >> > +    crc = expand_shift (RSHIFT_EXPR, DImode, crc, crc_size - data_size,
>> >> > +                        NULL_RTX, 1);
>> >> > +
>> >> > +  rtx t0 = force_reg (DImode, gen_int_mode (q, DImode));
>> >> > +  polynomial = simplify_gen_unary (ZERO_EXTEND, DImode, polynomial,
>> >> > +                                   GET_MODE (polynomial));
>> >> > +  rtx t1 = force_reg (DImode, polynomial);
>> >> > +
>> >> > +  rtx a0 = expand_binop (DImode, xor_optab, crc, data, NULL_RTX, 1,
>> >> > +                         OPTAB_WIDEN);
>> >> > +
>> >> > +  rtx pmull_res = gen_reg_rtx (TImode);
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t0));
>> >> > +  a0 = gen_lowpart (DImode, pmull_res);
>> >> > +
>> >> > +  a0 = expand_shift (RSHIFT_EXPR, DImode, a0, crc_size, NULL_RTX, 1);
>> >> > +
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t1));
>> >> > +  a0 = gen_lowpart (DImode, pmull_res);
>> >> > +
>> >> > +  if (crc_size > data_size)
>> >> > +    {
>> >> > +      rtx crc_part = expand_shift (LSHIFT_EXPR, DImode, operands[1], data_size,
>> >> > +                                   NULL_RTX, 0);
>> >> > +      a0 = expand_binop (DImode, xor_optab, a0, crc_part, NULL_RTX, 1,
>> >> > +                         OPTAB_DIRECT);
>> >> > +    }
>> >> > +
>> >> > +  aarch64_emit_move (operands[0], gen_lowpart (crc_mode, a0));
>> >> > +}
>> >> > +
>> >> > +/* Generate assembly to calculate reversed CRC
>> >> > +   using carry-less multiplication instruction.
>> >> > +   OPERANDS[1] is input CRC,
>> >> > +   OPERANDS[2] is data,
>> >> > +   OPERANDS[3] is the polynomial without the leading 1.  */
>> >> > +
>> >> > +void
>> >> > +aarch64_expand_reversed_crc_using_pmull (scalar_mode crc_mode,
>> >> > +                                         scalar_mode data_mode,
>> >> > +                                         rtx *operands)
>> >> > +{
>> >> > +  /* Check and keep arguments.  */
>> >> > +  gcc_assert (!CONST_INT_P (operands[0]));
>> >> > +  gcc_assert (CONST_INT_P (operands[3]));
>> >> > +  rtx crc = operands[1];
>> >> > +  rtx data = operands[2];
>> >> > +  rtx polynomial = operands[3];
>> >> > +
>> >> > +  unsigned HOST_WIDE_INT crc_size = GET_MODE_BITSIZE (crc_mode);
>> >> > +  unsigned HOST_WIDE_INT data_size = GET_MODE_BITSIZE (data_mode);
>> >> > +  gcc_assert (crc_size <= 32);
>> >> > +  gcc_assert (data_size <= crc_size);
>> >> > +
>> >> > +  /* Calculate the quotient.  */
>> >> > +  unsigned HOST_WIDE_INT
>> >> > +      q = gf2n_poly_long_div_quotient (UINTVAL (polynomial), crc_size);
>> >> > +  /* Reflect the calculated quotient.  */
>> >> > +  q = reflect_hwi (q, crc_size + 1);
>> >> > +  rtx t0 = force_reg (DImode, gen_int_mode (q, DImode));
>> >> > +
>> >> > +  /* Reflect the polynomial.  */
>> >> > +  unsigned HOST_WIDE_INT ref_polynomial = reflect_hwi (UINTVAL (polynomial),
>> >> > +                                                       crc_size);
>> >> > +  /* An unshifted multiplier would require the final result to be extracted
>> >> > +     using a shift right by DATA_SIZE - 1 bits.  Shift the multiplier left
>> >> > +     so that the shift right can be by CRC_SIZE bits instead.  */
>> >> > +  ref_polynomial <<= crc_size - data_size + 1;
>> >> > +  rtx t1 = force_reg (DImode, gen_int_mode (ref_polynomial, DImode));
>> >> > +
>> >> > +  /* CRC calculation's main part.  */
>> >> > +  rtx a0 = expand_binop (DImode, xor_optab, crc, data, NULL_RTX, 1,
>> >> > +                         OPTAB_WIDEN);
>> >> > +
>> >> > +  /* Perform carry-less multiplication and get low part.  */
>> >> > +  rtx pmull_res = gen_reg_rtx (TImode);
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t0));
>> >> > +  a0 = gen_lowpart (DImode, pmull_res);
>> >> > +
>> >> > +  a0 = expand_binop (DImode, and_optab, a0,
>> >> > +                     gen_int_mode (GET_MODE_MASK (data_mode), DImode),
>> >> > +                     NULL_RTX, 1, OPTAB_WIDEN);
>> >> > +
>> >> > +  /* Perform carry-less multiplication.  */
>> >> > +  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t1));
>> >> > +
>> >> > +  /* Perform a shift right by CRC_SIZE as an extraction of lane 1.  */
>> >> > +  machine_mode crc_vmode = aarch64_vq_mode (crc_mode).require ();
>> >> > +  a0 = (crc_size > data_size ? gen_reg_rtx (crc_mode) : operands[0]);
>> >> > +  emit_insn (gen_aarch64_get_lane (crc_vmode, a0,
>> >> > +                                   gen_lowpart (crc_vmode, pmull_res),
>> >> > +                                   aarch64_endian_lane_rtx (crc_vmode, 1)));
>> >> > +
>> >> > +  if (crc_size > data_size)
>> >> > +    {
>> >> > +      rtx crc_part = expand_shift (RSHIFT_EXPR, crc_mode, crc, data_size,
>> >> > +                                   NULL_RTX, 1);
>> >> > +      a0 = expand_binop (crc_mode, xor_optab, a0, crc_part, operands[0], 1,
>> >> > +                         OPTAB_WIDEN);
>> >> > +      aarch64_emit_move (operands[0], a0);
>> >> > +    }
>> >> > +}
>> >> > +
>> >> >  /* Target-specific selftests.  */
>> >> >
>> >> >  #if CHECKING_P
>> >> > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> >> > index 9de6235b139..bdb93ccaf76 100644
>> >> > --- a/gcc/config/aarch64/aarch64.md
>> >> > +++ b/gcc/config/aarch64/aarch64.md
>> >> > @@ -4556,6 +4556,63 @@
>> >> >    [(set_attr "type" "crc")]
>> >> >  )
>> >> >
>> >> > +;; Reversed CRC
>> >> > +(define_expand "crc_rev<ALLI:mode><ALLX:mode>4"
>> >> > +  [;; return value (calculated CRC)
>> >> > +   (match_operand:ALLX 0 "register_operand" "=r")
>> >> > +   ;; initial CRC
>> >> > +   (match_operand:ALLX 1 "register_operand" "r")
>> >> > +   ;; data
>> >> > +   (match_operand:ALLI 2 "register_operand" "r")
>> >> > +   ;; polynomial without leading 1
>> >> > +   (match_operand:ALLX 3)]
>> >> > +  ""
>> >> > +  {
>> >> > +    /* If the polynomial is the same as the polynomial of crc32c* instruction,
>> >> > +       put that instruction.  crc32c uses iSCSI polynomial.  */
>> >> > +    if (TARGET_CRC32 && INTVAL (operands[3]) == 0x1EDC6F41
>> >> > +        && <ALLX:MODE>mode == SImode)
>> >> > +      emit_insn (gen_aarch64_crc32c<ALLI:crc_data_type> (operands[0],
>> >> > +                                                          operands[1],
>> >> > +                                                          operands[2]));
>> >> > +    /* If the polynomial is the same as the polynomial of crc32* instruction,
>> >> > +       put that instruction.  crc32 uses HDLC etc. polynomial.  */
>> >> > +    else if (TARGET_CRC32 && INTVAL (operands[3]) == 0x04C11DB7
>> >> > +             && <ALLX:MODE>mode == SImode)
>> >> > +      emit_insn (gen_aarch64_crc32<ALLI:crc_data_type> (operands[0],
>> >> > +                                                         operands[1],
>> >> > +                                                         operands[2]));
>> >> > +    else if (TARGET_AES && <ALLI:sizen> <= <ALLX:sizen>)
>> >> > +      aarch64_expand_reversed_crc_using_pmull (<ALLX:MODE>mode,
>> >> > +                                               <ALLI:MODE>mode,
>> >> > +                                               operands);
>> >> > +    else
>> >> > +      /* Otherwise, generate table-based CRC.  */
>> >> > +      expand_reversed_crc_table_based (operands[0], operands[1], operands[2],
>> >> > +                                       operands[3], <ALLI:MODE>mode,
>> >> > +                                       generate_reflecting_code_standard);
>> >> > +    DONE;
>> >> > +  }
>> >> > +)
>> >> > +
>> >> > +;; Bit-forward CRC
>> >> > +(define_expand "crc<ALLI:mode><ALLX:mode>4"
>> >> > +  [;; return value (calculated CRC)
>> >> > +   (match_operand:ALLX 0 "register_operand" "=r")
>> >> > +   ;; initial CRC
>> >> > +   (match_operand:ALLX 1 "register_operand" "r")
>> >> > +   ;; data
>> >> > +   (match_operand:ALLI 2 "register_operand" "r")
>> >> > +   ;; polynomial without leading 1
>> >> > +   (match_operand:ALLX 3)]
>> >> > +  "TARGET_AES && <ALLI:sizen> <= <ALLX:sizen>"
>> >> > +  {
>> >> > +    aarch64_expand_crc_using_pmull (<ALLX:MODE>mode, <ALLI:MODE>mode,
>> >> > +                                    operands);
>> >> > +    DONE;
>> >> > +  }
>> >> > +)
>> >> > +
>> >> >  (define_insn "*csinc2<mode>_insn"
>> >> >    [(set (match_operand:GPI 0 "register_operand" "=r")
>> >> >          (plus:GPI (match_operand 2 "aarch64_comparison_operation" "")
>> >> > diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
>> >> > index f527b2cfeb8..8faba9025ce 100644
>> >> > --- a/gcc/config/aarch64/iterators.md
>> >> > +++ b/gcc/config/aarch64/iterators.md
>> >> > @@ -1276,6 +1276,10 @@
>> >> >  ;; Map a mode to a specific constraint character.
>> >> >  (define_mode_attr cmode [(QI "q") (HI "h") (SI "s") (DI "d")])
>> >> >
>> >> > +;; Map a mode to a specific constraint character for calling
>> >> > +;; appropriate version of crc.
>> >> > +(define_mode_attr crc_data_type [(QI "b") (HI "h") (SI "w") (DI "x")])
>> >> > +
>> >> >  ;; Map modes to Usg and Usj constraints for SISD right shifts
>> >> >  (define_mode_attr cmode_simd [(SI "g") (DI "j")])
>> >> >
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..4043251dbd8
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
>> >> > @@ -0,0 +1,8 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-1.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..dd866b38e83
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-10.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..16d901eeaef
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-12.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..5f7741fad0f
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-13.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..cdedbbd3db1
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-14.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..c219e49a2b1
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-17.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..124900a979b
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-18.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..3cae1a7f57b
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-21.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..0ec2e312f8f
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-22.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..0c4542adb40
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-23.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..08f1d3b69d7
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-4.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..91bf5e6353d
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -w -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-5.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..4680eafe758
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-6.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..655484d10d4
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-7.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..6c2acc84c32
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-8.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..e76f3c77b59
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-9.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..21520474564
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-CCIT-data16.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..3dcc92320f3
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-CCIT-data8.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
>> >> > new file mode 100644
>> >> > index 00000000000..e5196aaafef
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
>> >> > @@ -0,0 +1,9 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include "../../gcc.dg/torture/crc-coremark16-data16.c"
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
>> >> > \ No newline at end of file
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
>> >> > new file mode 100644
>> >> > index 00000000000..e82cb04fcc3
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
>> >> > @@ -0,0 +1,53 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint16_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +    if (crc & 1)
>> >> > +      crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +    else
>> >> > +      crc = (crc >> 1);
>> >> > +  }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint16_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +    if (crc & 1)
>> >> > +      crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +    else
>> >> > +      crc = (crc >> 1);
>> >> > +  }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint16_t i = 0; i < 0xffff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +        abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
>> >> > new file mode 100644
>> >> > index 00000000000..a7564a7e28a
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
>> >> > @@ -0,0 +1,52 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint32_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 32; i++) {
>> >> > +    if (crc & 1)
>> >> > +      crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +    else
>> >> > +      crc = (crc >> 1);
>> >> > +  }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint32_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 32; i++) {
>> >> > +    if (crc & 1)
>> >> > +      crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +    else
>> >> > +      crc = (crc >> 1);
>> >> > +  }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +int main ()
>> >> > +{
>> >> > +  uint32_t crc = 0x0D800D80;
>> >> > +  for (uint8_t i = 0; i < 0xff; i++)
>> >> > +    {
>> >> > +      uint32_t res1 = _crc32_O0 (crc, i);
>> >> > +      uint32_t res2 = _crc32 (crc, i);
>> >> > +      if (res1 != res2)
>> >> > +        abort ();
>> >> > +      crc = res1;
>> >> > +    }
>> >> > +}
>> >> > +
>> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
>> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
>> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
>> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
>> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
>> >> > new file mode 100644
>> >> > index 00000000000..c88cafadedc
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
>> >> > @@ -0,0 +1,53 @@
>> >> > +/* { dg-do run } */
>> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
>> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */
>> >> > +
>> >> > +#include <stdint.h>
>> >> > +#include <stdlib.h>
>> >> > +
>> >> > +__attribute__ ((noinline,optimize(0)))
>> >> > +uint32_t _crc32_O0 (uint32_t crc, uint8_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +    if (crc & 1)
>> >> > +      crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +    else
>> >> > +      crc = (crc >> 1);
>> >> > +  }
>> >> > +
>> >> > +  return crc;
>> >> > +}
>> >> > +
>> >> > +uint32_t _crc32 (uint32_t crc, uint8_t data) {
>> >> > +  int i;
>> >> > +  crc = crc ^ data;
>> >> > +
>> >> > +  for (i = 0; i < 8; i++) {
>> >> > +    if (crc & 1)
>> >> > +      crc = (crc >> 1) ^ 0xEDB88320;
>> >> > +    else
>> >> > +      crc
= (crc >> 1); >> >> > + } >> >> > + >> >> > + return crc; >> >> > +} >> >> > + >> >> > +int main () >> >> > +{ >> >> > + uint32_t crc = 0x0D800D80; >> >> > + for (uint8_t i = 0; i < 0xff; i++) >> >> > + { >> >> > + uint32_t res1 = _crc32_O0 (crc, i); >> >> > + uint32_t res2 = _crc32 (crc, i); >> >> > + if (res1 != res2) >> >> > + abort (); >> >> > + crc = res1; >> >> > + } >> >> > +} >> >> > + >> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */ >> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC >> >> > code." 0 "crc"} } */ >> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */ >> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */ >> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c >> >> > b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c >> >> > new file mode 100644 >> >> > index 00000000000..d82e6252603 >> >> > --- /dev/null >> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c >> >> > @@ -0,0 +1,53 @@ >> >> > +/* { dg-do run } */ >> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish >> >> > -fdump-tree-crc" } */ >> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */ >> >> > + >> >> > +#include <stdint.h> >> >> > +#include <stdlib.h> >> >> > + >> >> > +__attribute__ ((noinline,optimize(0))) >> >> > +uint32_t _crc32_O0 (uint32_t crc, uint16_t data) { >> >> > + int i; >> >> > + crc = crc ^ data; >> >> > + >> >> > + for (i = 0; i < 8; i++) { >> >> > + if (crc & 1) >> >> > + crc = (crc >> 1) ^ 0x82F63B78; >> >> > + else >> >> > + crc = (crc >> 1); >> >> > + } >> >> > + >> >> > + return crc; >> >> > +} >> >> > + >> >> > +uint32_t _crc32 (uint32_t crc, uint16_t data) { >> >> > + int i; >> >> > + crc = crc ^ data; >> >> > + >> >> > + for (i = 0; i < 8; i++) { >> >> > + if (crc & 1) >> >> > + crc = (crc >> 1) ^ 0x82F63B78; >> >> > + else >> >> > + crc = (crc >> 1); >> >> > + } >> >> > + >> >> > + return crc; >> >> > +} >> >> > + >> >> > +int main 
() >> >> > +{ >> >> > + uint32_t crc = 0x0D800D80; >> >> > + for (uint16_t i = 0; i < 0xffff; i++) >> >> > + { >> >> > + uint32_t res1 = _crc32_O0 (crc, i); >> >> > + uint32_t res2 = _crc32 (crc, i); >> >> > + if (res1 != res2) >> >> > + abort (); >> >> > + crc = res1; >> >> > + } >> >> > +} >> >> > + >> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */ >> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC >> >> > code." 0 "crc"} } */ >> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */ >> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */ >> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c >> >> > b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c >> >> > new file mode 100644 >> >> > index 00000000000..7acb6fc239c >> >> > --- /dev/null >> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c >> >> > @@ -0,0 +1,52 @@ >> >> > +/* { dg-do run } */ >> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish >> >> > -fdump-tree-crc" } */ >> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */ >> >> > + >> >> > +#include <stdint.h> >> >> > +#include <stdlib.h> >> >> > +__attribute__ ((noinline,optimize(0))) >> >> > +uint32_t _crc32_O0 (uint32_t crc, uint32_t data) { >> >> > + int i; >> >> > + crc = crc ^ data; >> >> > + >> >> > + for (i = 0; i < 32; i++) { >> >> > + if (crc & 1) >> >> > + crc = (crc >> 1) ^ 0x82F63B78; >> >> > + else >> >> > + crc = (crc >> 1); >> >> > + } >> >> > + >> >> > + return crc; >> >> > +} >> >> > + >> >> > +uint32_t _crc32 (uint32_t crc, uint32_t data) { >> >> > + int i; >> >> > + crc = crc ^ data; >> >> > + >> >> > + for (i = 0; i < 32; i++) { >> >> > + if (crc & 1) >> >> > + crc = (crc >> 1) ^ 0x82F63B78; >> >> > + else >> >> > + crc = (crc >> 1); >> >> > + } >> >> > + >> >> > + return crc; >> >> > +} >> >> > + >> >> > +int main () >> >> > +{ >> >> > + uint32_t crc = 0x0D800D80; >> >> > + for (uint8_t i = 0; i < 0xff; i++) >> 
>> > + { >> >> > + uint32_t res1 = _crc32_O0 (crc, i); >> >> > + uint32_t res2 = _crc32 (crc, i); >> >> > + if (res1 != res2) >> >> > + abort (); >> >> > + crc = res1; >> >> > + } >> >> > +} >> >> > + >> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */ >> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC >> >> > code." 0 "crc"} } */ >> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */ >> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */ >> >> > diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c >> >> > b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c >> >> > new file mode 100644 >> >> > index 00000000000..e8a8901e453 >> >> > --- /dev/null >> >> > +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c >> >> > @@ -0,0 +1,53 @@ >> >> > +/* { dg-do run } */ >> >> > +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish >> >> > -fdump-tree-crc" } */ >> >> > +/* { dg-skip-if "" { *-*-* } { "-flto"} } */ >> >> > + >> >> > +#include <stdint.h> >> >> > +#include <stdlib.h> >> >> > + >> >> > +__attribute__ ((noinline,optimize(0))) >> >> > +uint32_t _crc32_O0 (uint32_t crc, uint8_t data) { >> >> > + int i; >> >> > + crc = crc ^ data; >> >> > + >> >> > + for (i = 0; i < 8; i++) { >> >> > + if (crc & 1) >> >> > + crc = (crc >> 1) ^ 0x82F63B78; >> >> > + else >> >> > + crc = (crc >> 1); >> >> > + } >> >> > + >> >> > + return crc; >> >> > +} >> >> > + >> >> > +uint32_t _crc32 (uint32_t crc, uint8_t data) { >> >> > + int i; >> >> > + crc = crc ^ data; >> >> > + >> >> > + for (i = 0; i < 8; i++) { >> >> > + if (crc & 1) >> >> > + crc = (crc >> 1) ^ 0x82F63B78; >> >> > + else >> >> > + crc = (crc >> 1); >> >> > + } >> >> > + >> >> > + return crc; >> >> > +} >> >> > + >> >> > +int main () >> >> > +{ >> >> > + uint32_t crc = 0x0D800D80; >> >> > + for (uint8_t i = 0; i < 0xff; i++) >> >> > + { >> >> > + uint32_t res1 = _crc32_O0 (crc, i); >> >> > + uint32_t res2 = _crc32 (crc, i); >> 
>> > + if (res1 != res2) >> >> > + abort (); >> >> > + crc = res1; >> >> > + } >> >> > +} >> >> > + >> >> > +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */ >> >> > +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC >> >> > code." 0 "crc"} } */ >> >> > +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */ >> >> > +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */