Hi, Gently ping the series of patches. [PATCH-1, rs6000]Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650217.html [PATCH-2, rs6000] Add a new type of CC mode - CCLTEQ https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650218.html [PATCH-3, rs6000] Set CC mode of vector string isolate insns to CCEQ https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650219.html [PATCH-4, rs6000] Optimize single cc bit reverse implementation https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650220.html [PATCH-5, rs6000] Replace explicit CC bit reverse with common format https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650766.html [PATCH-6, rs6000] Split setcc to two insns after reload https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650856.html
Thanks Gui Haochen 在 2024/4/30 15:18, HAO CHEN GUI 写道: > Hi, > It's the first patch of a series of patches optimizing CC modes on > rs6000. > > bcd insns set all four bits of a CR field. But it has different single > bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used > to indict overflow or invalid number. It's not a bit for unordered test. > So the "le" test should be reversed to "gt" not "ungt". The "ge" test > should be reversed to "lt" not "unlt". That's the root cause of PR100736 > and PR114732. > > This patch fixes the issue by adding a new type of CC mode - CCBCD for > all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will > be merged to a uniform pattern which is for all CC modes in sequential > patch. > > The rtl code "unordered" is still used for testing overflow or > invalid number. IMHO, the "unordered" on a CC mode can be considered as > testing the forth bit of a CR field setting or not. The "eq" on a CC mode > can be considered as testing the third bit setting or not. Thus we avoid > creating lots of unspecs for the CR bit testing. > > Bootstrapped and tested on powerpc64-linux BE and LE with no > regressions. Is it OK for the trunk? > > Thanks > Gui Haochen > > > ChangeLog > rs6000: Add a new type of CC mode - CCBCD for bcd insns > > gcc/ > PR target/100736 > PR target/114732 > * config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP > with CCBCD. > (*bcd<bcd_add_sub>_test_<mode>): Likewise. > (*bcd<bcd_add_sub>_test2_<mode>): Likewise. > (bcd<bcd_add_sub>_<code>_<mode>): Likewise. > (*bcdinvalid_<mode>): Likewise. > (bcdinvalid_<mode>): Likewise. > (bcdshift_v16qi): Likewise. > (bcdmul10_v16qi): Likewise. > (bcddiv10_v16qi): Likewise. > (peephole for bcd_add/sub): Likewise. > * config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD > and its supported comparison codes. > * config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD. > * config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD > assertion. > * config/rs6000/rs6000.md (CC_any): Add CCBCD. > (ccbcd_rev): New code iterator. > (*<code><mode>_cc): New insn and split pattern for CCBCD reverse > compare. > > gcc/testsuite/ > PR target/100736 > PR target/114732 > * gcc.target/powerpc/pr100736.c: New. > * gcc.target/powerpc/pr114732.c: New. > > patch.diff > diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md > index bb20441c096..9fa8cf89f61 100644 > --- a/gcc/config/rs6000/altivec.md > +++ b/gcc/config/rs6000/altivec.md > @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>" > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "n")] > UNSPEC_BCD_ADD_SUB)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P8_VECTOR" > "bcd<bcd_add_sub>. %0,%1,%2,%3" > [(set_attr "type" "vecsimple")]) > @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>" > ;; probably should be one that can go in the VMX (Altivec) registers, so we > ;; can't use DDmode or DFmode. > (define_insn "*bcd<bcd_add_sub>_test_<mode>" > - [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v") > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "i")] > @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>" > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "i")] > UNSPEC_BCD_ADD_SUB)) > - (set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > @@ -4566,8 +4566,8 @@ (define_insn "vclrrb" > [(set_attr "type" "vecsimple")]) > > (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > - [(parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand") > (match_operand:VBCD 2 "register_operand") > (match_operand:QI 3 "const_0_to_1_operand")] > @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > (match_dup 4))) > (clobber (match_scratch:VBCD 5))]) > (set (match_operand:SI 0 "register_operand") > - (BCD_TEST:SI (reg:CCFP CR6_REGNO) > + (BCD_TEST:SI (reg:CCBCD CR6_REGNO) > (const_int 0)))] > "TARGET_P8_VECTOR" > { > @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > }) > > (define_insn "*bcdinvalid_<mode>" > - [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")] > UNSPEC_BCDSUB) > (match_operand:V2DF 2 "zero_constant" "j"))) > @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>" > [(set_attr "type" "vecsimple")]) > > (define_expand "bcdinvalid_<mode>" > - [(parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand")] > UNSPEC_BCDSUB) > (match_dup 2))) > (clobber (match_scratch:VBCD 3))]) > (set (match_operand:SI 0 "register_operand") > - (unordered:SI (reg:CCFP CR6_REGNO) > + (unordered:SI (reg:CCBCD CR6_REGNO) > (const_int 0)))] > "TARGET_P8_VECTOR" > { > @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi" > (match_operand:V16QI 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "n")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P8_VECTOR" > "bcds. %0,%1,%2,%3" > [(set_attr "type" "vecsimple")]) > @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi" > [(set (match_operand:V16QI 0 "register_operand") > (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P9_VECTOR" > { > rtx one = gen_reg_rtx (V16QImode); > @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi" > [(set (match_operand:V16QI 0 "register_operand") > (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P9_VECTOR" > { > rtx one = gen_reg_rtx (V16QImode); > @@ -4662,9 +4662,9 @@ (define_peephole2 > (match_operand:V1TI 2 "register_operand") > (match_operand:QI 3 "const_0_to_1_operand")] > UNSPEC_BCD_ADD_SUB)) > - (clobber (reg:CCFP CR6_REGNO))]) > - (parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (clobber (reg:CCBCD CR6_REGNO))]) > + (parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > @@ -4677,8 +4677,8 @@ (define_peephole2 > (match_dup 2) > (match_dup 3)] > UNSPEC_BCD_ADD_SUB)) > - (set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md > index d23ce9a77a3..18198add744 100644 > --- a/gcc/config/rs6000/predicates.md > +++ b/gcc/config/rs6000/predicates.md > @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator" > (if_then_else (match_test "flag_finite_math_only") > (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") > (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered")) > - (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")) > + (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode") > + (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") > + (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))) > (match_test "validate_condition_mode (GET_CODE (op), > GET_MODE (XEXP (op, 0))), > 1"))) > diff --git a/gcc/config/rs6000/rs6000-modes.def > b/gcc/config/rs6000/rs6000-modes.def > index 094b246c834..3e2e6dfb4ff 100644 > --- a/gcc/config/rs6000/rs6000-modes.def > +++ b/gcc/config/rs6000/rs6000-modes.def > @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, > ieee_quad_format); > > CC_MODE (CCUNS); > CC_MODE (CCFP); > +CC_MODE (CCBCD); /* Used for bcd insns */ > CC_MODE (CCEQ); > > /* Vector modes. */ > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 6ba9df4f02e..4068cd8b929 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, > machine_mode mode) > gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU) > || mode == CCUNSmode); > > + gcc_assert (mode == CCFPmode || mode == CCBCDmode > + || (code != ORDERED && code != UNORDERED)); > + > gcc_assert (mode == CCFPmode > - || (code != ORDERED && code != UNORDERED > - && code != UNEQ && code != LTGT > + || (code != UNEQ && code != LTGT > && code != UNGT && code != UNLT > && code != UNGE && code != UNLE)); > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index bc8bc6ab060..9b5fcdc8db0 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -8115,7 +8115,7 @@ (define_expand "movcc" > "" > "") > > -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP]) > +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD]) > > (define_insn "*movcc_<mode>" > [(set (match_operand:CC_any 0 "nonimmediate_operand" > @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>" > > (define_code_iterator fp_rev [ordered ne unle unge]) > (define_code_iterator fp_two [ltgt le ge unlt ungt uneq]) > +(define_code_iterator ccbcd_rev [ordered ne le ge]) > > (define_insn_and_split "*<code><mode>_cc" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc" > } > [(set_attr "length" "12")]) > > +(define_insn_and_split "*<code><mode>_cc" > + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > + (ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y") > + (const_int 0)))] > + "" > + "#" > + "&& 1" > + [(pc)] > +{ > + rtx_code revcode = reverse_condition (<CODE>); > + rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx); > + rtx tmp = gen_reg_rtx (<MODE>mode); > + emit_move_insn (tmp, eq); > + emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx)); > + DONE; > +} > + [(set_attr "length" "12")]) > + > (define_insn_and_split "*<code><mode>_cc" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > (fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y") > diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c > b/gcc/testsuite/gcc.target/powerpc/pr100736.c > new file mode 100644 > index 00000000000..85e3ae56d55 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > + > +/* Verify there is no ICE with finite-math-only */ > + > +int foo (vector unsigned char a, vector unsigned char b) > +{ > + return __builtin_vec_bcdsub_ge (a, b, 0); > +} > + > + > diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c > b/gcc/testsuite/gcc.target/powerpc/pr114732.c > new file mode 100644 > index 00000000000..d0b878780c6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > + > +/* Verify only one cr bit need to be tested. */ > + > +int foo (vector unsigned char a, vector unsigned char b) > +{ > + return __builtin_vec_bcdsub_ge (a, b, 0) != 1; > +} > + > +/* { dg-final { scan-assembler-not "cror" } } */