Richard Sandiford via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> > +/* Check to see if the supplied comparison in PTEST can be performed as a
>>> > +   bit-test-and-branch instead.  VAL must contain the original tree
>>> > +   expression of the non-zero operand which will be used to rewrite the
>>> > +   comparison in PTEST.
>>> > +
>>> > +   Returns TRUE if operation succeeds and returns updated PMODE and PTEST,
>>> > +   else FALSE.  */
>>> > +
>>> > +static enum insn_code
>>> > +validate_test_and_branch (tree val, rtx *ptest, machine_mode *pmode)
>>> > +{
>>> > +  if (!val || TREE_CODE (val) != SSA_NAME)
>>> > +    return CODE_FOR_nothing;
>>> > +
>>> > +  machine_mode mode = TYPE_MODE (TREE_TYPE (val));
>>> > +  rtx test = *ptest;
>>> > +
>>> > +  if (GET_CODE (test) != EQ && GET_CODE (test) != NE)
>>> > +    return CODE_FOR_nothing;
>>> > +
>>> > +  /* If the target supports the testbit comparison directly, great.  */
>>> > +  auto icode = direct_optab_handler (tbranch_optab, mode);
>>> > +  if (icode == CODE_FOR_nothing)
>>> > +    return icode;
>>> > +
>>> > +  if (tree_zero_one_valued_p (val))
>>> > +    {
>>> > +      auto pos = BYTES_BIG_ENDIAN ? GET_MODE_BITSIZE (mode) - 1 : 0;
>>> 
>>> Does this work for BYTES_BIG_ENDIAN && !WORDS_BIG_ENDIAN and mode > word_mode?
>>> 
>>
>> It does now.  In this particular case all that matters is the bit ordering,
>> so I've changed it to BITS_BIG_ENDIAN.
>>
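
For readers following the bit-numbering point, here is a minimal standalone
sketch of the mapping the patch applies.  It is not GCC code:
target_bit_number is a hypothetical helper, bits_big_endian stands in for
BITS_BIG_ENDIAN and mode_bits for GET_MODE_BITSIZE (mode).

```cpp
// Hypothetical helper, not part of GCC.  LSB_INDEX counts from the least
// significant bit.  When the target numbers bits from the most significant
// end (the assumption behind BITS_BIG_ENDIAN), bit 0 is the MSB, so the
// index has to be mirrored within the mode.
constexpr int
target_bit_number (int lsb_index, int mode_bits, bool bits_big_endian)
{
  return bits_big_endian ? mode_bits - 1 - lsb_index : lsb_index;
}

// The zero-or-one case from the patch tests the least significant bit.
static_assert (target_bit_number (0, 32, false) == 0, "LSB stays bit 0");
static_assert (target_bit_number (0, 32, true) == 31, "LSB becomes bit 31");
```

Byte and word ordering do not enter into it, which is why the check uses
the bit-ordering macro alone.
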
>> Also, during the review of the AArch64 optab, Richard Sandiford wanted me to
>> split the optabs apart into two.  The reason is that a match_operator still
>> gets the full RTL.
>>
>> In the case of a tbranch the full RTL has an invalid comparison, so if a
>> target doesn't implement the hook correctly this would lead to incorrect
>> code.  We've now moved the operator into the name itself to avoid this.
>>
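
As a rough illustration of encoding the operator in the pattern name, the
selection could be modelled as below.  This is only a sketch under stated
assumptions: cmp_code and tbranch_pattern_stem are hypothetical names made
up for the example, and the real code picks a direct_optab rather than a
string.

```cpp
#include <optional>
#include <string>

// Illustrative subset of comparison codes; not GCC's rtx_code.
enum class cmp_code { eq, ne, lt, ge };

// Hypothetical helper: return the pattern-name stem for comparisons that
// have a test-and-branch form, or nothing for every other comparison.
std::optional<std::string>
tbranch_pattern_stem (cmp_code code)
{
  switch (code)
    {
    case cmp_code::eq:
      return std::string ("tbranch_eq");
    case cmp_code::ne:
      return std::string ("tbranch_ne");
    default:
      return std::nullopt;
    }
}
```

With the operator in the name, a target pattern never sees the comparison
rtx itself, so it cannot misread it.
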
>> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>>
>> Ok for master?
>>
>> Thanks,
>> Tamar
>>
>> gcc/ChangeLog:
>>
>>      * dojump.cc (do_jump): Pass along value.
>>      (do_jump_by_parts_greater_rtx): Likewise.
>>      (do_jump_by_parts_zero_rtx): Likewise.
>>      (do_jump_by_parts_equality_rtx): Likewise.
>>      (do_compare_rtx_and_jump): Likewise.
>>      (do_compare_and_jump): Likewise.
>>      * dojump.h (do_compare_rtx_and_jump): New.
>>      * optabs.cc (emit_cmp_and_jump_insn_1): Refactor to take optab to check.
>>      (validate_test_and_branch): New.
>>      (emit_cmp_and_jump_insns): Optionally take a value, and when value is
>>      supplied then check if it's suitable for tbranch.
>>      * optabs.def (tbranch_eq$a4, tbranch_ne$a4): New.
>>      * doc/md.texi (tbranch_@var{op}@var{mode}4): Document it.
>>      * optabs.h (emit_cmp_and_jump_insns): New overload.
>>      * tree.h (tree_zero_one_valued_p): New.
>
> Thanks for doing this.
>
>> --- inline copy of patch ---
>>
>> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
>> index d0a71ecbb806de3a6564c6ffe973fec5da5c597b..c6c4b13d756de28078a0a779876a00c614246914 100644
>> --- a/gcc/doc/md.texi
>> +++ b/gcc/doc/md.texi
>> @@ -6964,6 +6964,14 @@ case, you can and should make operand 1's predicate reject some operators
>>  in the @samp{cstore@var{mode}4} pattern, or remove the pattern altogether
>>  from the machine description.
>>  
>> +@cindex @code{tbranch_@var{op}@var{mode}4} instruction pattern
>> +@item @samp{tbranch_@var{op}@var{mode}4}
>> +Conditional branch instruction combined with a bit test-and-compare
>> +instruction. Operand 0 is a comparison operator.  Operand 1 is the
>> +operand of the comparison. Operand 2 is the bit position of Operand 1 to test.
>> +Operand 3 is the @code{code_label} to jump to. @var{op} is one of @var{eq} or
>> +@var{ne}.
>> +
>
> The documentation still describes the old interface.  Also, there are only 3
> operands now, rather than 4, so the optab name should end with 3.
>
>>  @cindex @code{cbranch@var{mode}4} instruction pattern
>>  @item @samp{cbranch@var{mode}4}
>>  Conditional branch instruction combined with a compare instruction.
>> diff --git a/gcc/dojump.h b/gcc/dojump.h
>> index e379cceb34bb1765cb575636e4c05b61501fc2cf..d1d79c490c420a805fe48d58740a79c1f25fb839 100644
>> --- a/gcc/dojump.h
>> +++ b/gcc/dojump.h
>> @@ -71,6 +71,10 @@ extern void jumpifnot (tree exp, rtx_code_label *label,
>>  extern void jumpifnot_1 (enum tree_code, tree, tree, rtx_code_label *,
>>                       profile_probability);
>>  
>> +extern void do_compare_rtx_and_jump (rtx, rtx, enum rtx_code, int, tree,
>> +                                 machine_mode, rtx, rtx_code_label *,
>> +                                 rtx_code_label *, profile_probability);
>> +
>>  extern void do_compare_rtx_and_jump (rtx, rtx, enum rtx_code, int,
>>                                   machine_mode, rtx, rtx_code_label *,
>>                                   rtx_code_label *, profile_probability);
>> diff --git a/gcc/dojump.cc b/gcc/dojump.cc
>> index 2af0cd1aca3b6af13d5d8799094ee93f18022296..190324f36f1a31990f8c49bc8c0f45c23da5c31e 100644
>> --- a/gcc/dojump.cc
>> +++ b/gcc/dojump.cc
>> @@ -619,7 +619,7 @@ do_jump (tree exp, rtx_code_label *if_false_label,
>>      }
>>        do_compare_rtx_and_jump (temp, CONST0_RTX (GET_MODE (temp)),
>>                             NE, TYPE_UNSIGNED (TREE_TYPE (exp)),
>> -                           GET_MODE (temp), NULL_RTX,
>> +                           exp, GET_MODE (temp), NULL_RTX,
>>                             if_false_label, if_true_label, prob);
>>      }
>>  
>> @@ -687,7 +687,7 @@ do_jump_by_parts_greater_rtx (scalar_int_mode mode, int unsignedp, rtx op0,
>>  
>>        /* All but high-order word must be compared as unsigned.  */
>>        do_compare_rtx_and_jump (op0_word, op1_word, code, (unsignedp || i > 0),
>> -                           word_mode, NULL_RTX, NULL, if_true_label,
>> +                           NULL, word_mode, NULL_RTX, NULL, if_true_label,
>>                             prob);
>>  
>>        /* Emit only one comparison for 0.  Do not emit the last cond jump.  */
>> @@ -695,8 +695,8 @@ do_jump_by_parts_greater_rtx (scalar_int_mode mode, int unsignedp, rtx op0,
>>      break;
>>  
>>        /* Consider lower words only if these are equal.  */
>> -      do_compare_rtx_and_jump (op0_word, op1_word, NE, unsignedp, word_mode,
>> -                           NULL_RTX, NULL, if_false_label,
>> +      do_compare_rtx_and_jump (op0_word, op1_word, NE, unsignedp, NULL,
>> +                           word_mode, NULL_RTX, NULL, if_false_label,
>>                             prob.invert ());
>>      }
>>  
>> @@ -755,7 +755,7 @@ do_jump_by_parts_zero_rtx (scalar_int_mode mode, rtx op0,
>>  
>>    if (part != 0)
>>      {
>> -      do_compare_rtx_and_jump (part, const0_rtx, EQ, 1, word_mode,
>> +      do_compare_rtx_and_jump (part, const0_rtx, EQ, 1, NULL, word_mode,
>>                             NULL_RTX, if_false_label, if_true_label, prob);
>>        return;
>>      }
>> @@ -766,7 +766,7 @@ do_jump_by_parts_zero_rtx (scalar_int_mode mode, rtx op0,
>>  
>>    for (i = 0; i < nwords; i++)
>>      do_compare_rtx_and_jump (operand_subword_force (op0, i, mode),
>> -                             const0_rtx, EQ, 1, word_mode, NULL_RTX,
>> +                         const0_rtx, EQ, 1, NULL, word_mode, NULL_RTX,
>>                           if_false_label, NULL, prob);
>>  
>>    if (if_true_label)
>> @@ -809,8 +809,8 @@ do_jump_by_parts_equality_rtx (scalar_int_mode mode, rtx op0, rtx op1,
>>  
>>    for (i = 0; i < nwords; i++)
>>      do_compare_rtx_and_jump (operand_subword_force (op0, i, mode),
>> -                             operand_subword_force (op1, i, mode),
>> -                             EQ, 0, word_mode, NULL_RTX,
>> +                         operand_subword_force (op1, i, mode),
>> +                         EQ, 0, NULL, word_mode, NULL_RTX,
>>                           if_false_label, NULL, prob);
>>  
>>    if (if_true_label)
>> @@ -962,6 +962,23 @@ do_compare_rtx_and_jump (rtx op0, rtx op1, enum rtx_code code, int unsignedp,
>>                       rtx_code_label *if_false_label,
>>                       rtx_code_label *if_true_label,
>>                       profile_probability prob)
>> +{
>> +  do_compare_rtx_and_jump (op0, op1, code, unsignedp, NULL, mode, size,
>> +                      if_false_label, if_true_label, prob);
>> +}
>> +
>> +/* Like do_compare_and_jump but expects the values to compare as two rtx's.
>> +   The decision as to signed or unsigned comparison must be made by the caller.
>> +
>> +   If MODE is BLKmode, SIZE is an RTX giving the size of the objects being
>> +   compared.  */
>> +
>> +void
>> +do_compare_rtx_and_jump (rtx op0, rtx op1, enum rtx_code code, int unsignedp,
>> +                     tree val, machine_mode mode, rtx size,
>> +                     rtx_code_label *if_false_label,
>> +                     rtx_code_label *if_true_label,
>> +                     profile_probability prob)
>>  {
>>    rtx tem;
>>    rtx_code_label *dummy_label = NULL;
>> @@ -1177,8 +1194,10 @@ do_compare_rtx_and_jump (rtx op0, rtx op1, enum rtx_code code, int unsignedp,
>>                  }
>>                else
>>                  dest_label = if_false_label;
>> -                  do_compare_rtx_and_jump (op0, op1, first_code, unsignedp, mode,
>> -                                       size, dest_label, NULL, first_prob);
>> +
>> +              do_compare_rtx_and_jump (op0, op1, first_code, unsignedp,
>> +                                       val, mode, size, dest_label, NULL,
>> +                                       first_prob);
>>              }
>>            /* For !and_them we want to split:
>>               if (x) goto t; // prob;
>> @@ -1192,8 +1211,9 @@ do_compare_rtx_and_jump (rtx op0, rtx op1, enum rtx_code code, int unsignedp,
>>                else
>>              {
>>                profile_probability first_prob = prob.split (cprob);
>> -              do_compare_rtx_and_jump (op0, op1, first_code, unsignedp, mode,
>> -                                       size, NULL, if_true_label, first_prob);
>> +              do_compare_rtx_and_jump (op0, op1, first_code, unsignedp,
>> +                                       val, mode, size, NULL,
>> +                                       if_true_label, first_prob);
>>                if (orig_code == NE && can_compare_p (UNEQ, mode, ccp_jump))
>>                  {
>>                    /* x != y can be split into x unord y || x ltgt y
>> @@ -1215,7 +1235,7 @@ do_compare_rtx_and_jump (rtx op0, rtx op1, enum rtx_code code, int unsignedp,
>>          }
>>      }
>>  
>> -      emit_cmp_and_jump_insns (op0, op1, code, size, mode, unsignedp,
>> +      emit_cmp_and_jump_insns (op0, op1, code, size, mode, unsignedp, val,
>>                             if_true_label, prob);
>>      }
>>  
>> @@ -1289,9 +1309,9 @@ do_compare_and_jump (tree treeop0, tree treeop1, enum rtx_code signed_code,
>>        op1 = new_op1;
>>      }
>>  
>> -  do_compare_rtx_and_jump (op0, op1, code, unsignedp, mode,
>> -                           ((mode == BLKmode)
>> -                            ? expr_size (treeop0) : NULL_RTX),
>> +  do_compare_rtx_and_jump (op0, op1, code, unsignedp, treeop0, mode,
>> +                       ((mode == BLKmode)
>> +                        ? expr_size (treeop0) : NULL_RTX),
>>                         if_false_label, if_true_label, prob);
>>  }
>>  
>> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
>> index 31b15fd3df5fa88119867a23d2abbed139a05115..303b4fd2def9278ddbc3d586103ac8274e73a982 100644
>> --- a/gcc/optabs.cc
>> +++ b/gcc/optabs.cc
>> @@ -46,6 +46,8 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "libfuncs.h"
>>  #include "internal-fn.h"
>>  #include "langhooks.h"
>> +#include "gimple.h"
>> +#include "ssa.h"
>>  
>>  static void prepare_float_lib_cmp (rtx, rtx, enum rtx_code, rtx *,
>>                                 machine_mode *);
>> @@ -4623,7 +4625,8 @@ prepare_operand (enum insn_code icode, rtx x, int opnum, machine_mode mode,
>>  
>>  static void
>>  emit_cmp_and_jump_insn_1 (rtx test, machine_mode mode, rtx label,
>> -                      profile_probability prob)
>> +                      direct_optab cmp_optab, profile_probability prob,
>> +                      bool test_branch)
>>  {
>>    machine_mode optab_mode;
>>    enum mode_class mclass;
>> @@ -4632,12 +4635,17 @@ emit_cmp_and_jump_insn_1 (rtx test, machine_mode mode, rtx label,
>>  
>>    mclass = GET_MODE_CLASS (mode);
>>    optab_mode = (mclass == MODE_CC) ? CCmode : mode;
>> -  icode = optab_handler (cbranch_optab, optab_mode);
>> +  icode = optab_handler (cmp_optab, optab_mode);
>>  
>>    gcc_assert (icode != CODE_FOR_nothing);
>> -  gcc_assert (insn_operand_matches (icode, 0, test));
>> -  insn = emit_jump_insn (GEN_FCN (icode) (test, XEXP (test, 0),
>> -                                          XEXP (test, 1), label));
>> +  gcc_assert (test_branch || insn_operand_matches (icode, 0, test));
>> +  if (test_branch)
>> +    insn = emit_jump_insn (GEN_FCN (icode) (XEXP (test, 0),
>> +                                        XEXP (test, 1), label));
>> +  else
>> +    insn = emit_jump_insn (GEN_FCN (icode) (test, XEXP (test, 0),
>> +                                        XEXP (test, 1), label));
>> +
>>    if (prob.initialized_p ()
>>        && profile_status_for_fn (cfun) != PROFILE_ABSENT
>>        && insn
>> @@ -4647,6 +4655,63 @@ emit_cmp_and_jump_insn_1 (rtx test, machine_mode mode, rtx label,
>>      add_reg_br_prob_note (insn, prob);
>>  }
>>  
>> +/* Check to see if the supplied comparison in PTEST can be performed as a
>> +   bit-test-and-branch instead.  VAL must contain the original tree
>> +   expression of the non-zero operand which will be used to rewrite the
>> +   comparison in PTEST.
>> +
>> +   Returns TRUE if operation succeeds and returns updated PMODE and PTEST,
>> +   else FALSE.  */
>
> The function now returns an icode rather than true/false.  I think it'd
> also be good to clarify what *PTEST means for the tbranch case.  How about:
>
> /* PTEST points to a comparison that compares its first operand with zero.
>    Check to see if it can be performed as a bit-test-and-branch instead.
>    On success, return the instruction that performs the bit-and-test-and-branch

(bit-test-and-branch)

>    and replace the second operand of *PTEST with the bit number to test.
>    On failure, return CODE_FOR_nothing and leave *PTEST unchanged.
>
>    Note that the comparison described by *PTEST should not be taken
>    literally after a successful return.  *PTEST is just a convenient
>    place to store the two operands of the bit-and-test.
>
>    VAL must contain the original tree expression for the first operand
>    of *PTEST.  */
>
> Looks good to me otherwise.
>
> Thanks,
> Richard
>
>> +static enum insn_code
>> +validate_test_and_branch (tree val, rtx *ptest, machine_mode *pmode, optab *res)
>> +{
>> +  if (!val || TREE_CODE (val) != SSA_NAME)
>> +    return CODE_FOR_nothing;
>> +
>> +  machine_mode mode = TYPE_MODE (TREE_TYPE (val));
>> +  rtx test = *ptest;
>> +  direct_optab optab;
>> +
>> +  if (GET_CODE (test) == EQ)
>> +    optab = tbranch_eq_optab;
>> +  else if (GET_CODE (test) == NE)
>> +    optab = tbranch_ne_optab;
>> +  else
>> +    return CODE_FOR_nothing;
>> +
>> +  *res = optab;
>> +
>> +  /* If the target supports the testbit comparison directly, great.  */
>> +  auto icode = direct_optab_handler (optab, mode);
>> +  if (icode == CODE_FOR_nothing)
>> +    return icode;
>> +
>> +  if (tree_zero_one_valued_p (val))
>> +    {
>> +      auto pos = BITS_BIG_ENDIAN ? GET_MODE_BITSIZE (mode) - 1 : 0;
>> +      XEXP (test, 1) = gen_int_mode (pos, mode);
>> +      *ptest = test;
>> +      *pmode = mode;
>> +      return icode;
>> +    }
>> +
>> +  wide_int wcst = get_nonzero_bits (val);
>> +  if (wcst == -1)
>> +    return CODE_FOR_nothing;
>> +
>> +  int bitpos;
>> +
>> +  if ((bitpos = wi::exact_log2 (wcst)) == -1)
>> +    return CODE_FOR_nothing;
>> +
>> +  auto pos = BITS_BIG_ENDIAN ? GET_MODE_BITSIZE (mode) - 1 - bitpos : bitpos;
>> +  XEXP (test, 1) = gen_int_mode (pos, mode);
>> +  *ptest = test;
>> +  *pmode = mode;
>> +  return icode;
>> +}
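
The fallback path above accepts VAL only when at most one bit of it can be
nonzero.  A minimal standalone sketch of that check follows, assuming a
plain 64-bit mask in place of the wide_int from get_nonzero_bits;
single_possible_bit is a hypothetical name, not a GCC function.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the exact-log2 step.  NONZERO_MASK has a 1 in
// every bit position that might be set in the value.  Return the index of
// the only such bit, counted from the least significant end, or nothing
// when zero bits or more than one bit might be set.
std::optional<int>
single_possible_bit (std::uint64_t nonzero_mask)
{
  // A power of two has exactly one bit set, so clearing the lowest set
  // bit must leave zero.
  if (nonzero_mask == 0 || (nonzero_mask & (nonzero_mask - 1)) != 0)
    return std::nullopt;

  int pos = 0;
  while ((nonzero_mask >> pos) != 1)
    ++pos;
  return pos;
}
```

An all-ones mask, meaning nothing is known about the value, falls into the
"more than one bit" case, so the separate comparison against -1 in the
patch is an early exit rather than a distinct condition.
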
>> +
>>  /* Generate code to compare X with Y so that the condition codes are
>>     set and to jump to LABEL if the condition is true.  If X is a
>>     constant and Y is not a constant, then the comparison is swapped to
>> @@ -4664,11 +4729,13 @@ emit_cmp_and_jump_insn_1 (rtx test, machine_mode mode, rtx label,
>>     It will be potentially converted into an unsigned variant based on
>>     UNSIGNEDP to select a proper jump instruction.
>>     
>> -   PROB is the probability of jumping to LABEL.  */
>> +   PROB is the probability of jumping to LABEL.  If the comparison is against
>> +   zero then VAL contains the expression from which the non-zero RTL is
>> +   derived.  */
>>  
>>  void
>>  emit_cmp_and_jump_insns (rtx x, rtx y, enum rtx_code comparison, rtx size,
>> -                     machine_mode mode, int unsignedp, rtx label,
>> +                     machine_mode mode, int unsignedp, tree val, rtx label,
>>                           profile_probability prob)
>>  {
>>    rtx op0 = x, op1 = y;
>> @@ -4693,10 +4760,34 @@ emit_cmp_and_jump_insns (rtx x, rtx y, enum rtx_code comparison, rtx size,
>>  
>>    prepare_cmp_insn (op0, op1, comparison, size, unsignedp, OPTAB_LIB_WIDEN,
>>                  &test, &mode);
>> -  emit_cmp_and_jump_insn_1 (test, mode, label, prob);
>> +
>> +  /* Check if we're comparing a truth type with 0, and if so check if
>> +     the target supports tbranch.  */
>> +  machine_mode tmode = mode;
>> +  direct_optab optab;
>> +  if (op1 == CONST0_RTX (GET_MODE (op1))
>> +      && validate_test_and_branch (val, &test, &tmode,
>> +                               &optab) != CODE_FOR_nothing)
>> +    {
>> +      emit_cmp_and_jump_insn_1 (test, tmode, label, optab, prob, true);
>> +      return;
>> +    }
>> +
>> +  emit_cmp_and_jump_insn_1 (test, mode, label, cbranch_optab, prob, false);
>>  }
>>  
>> -
>>
>> +/* Overloaded version of emit_cmp_and_jump_insns in which VAL is unknown.  */
>> +
>> +void
>> +emit_cmp_and_jump_insns (rtx x, rtx y, enum rtx_code comparison, rtx size,
>> +                     machine_mode mode, int unsignedp, rtx label,
>> +                     profile_probability prob)
>> +{
>> +  emit_cmp_and_jump_insns (x, y, comparison, size, mode, unsignedp, NULL,
>> +                       label, prob);
>> +}
>> +
>> +
>>  /* Emit a library call comparison between floating point X and Y.
>>     COMPARISON is the rtl operator to compare with (EQ, NE, GT, etc.).  */
>>  
>> diff --git a/gcc/optabs.def b/gcc/optabs.def
>> index a6db2342bed6baf13ecbd84112c8432c6972e6fe..3199b05e90d6b9b9c6fb3c0353db3db02321e964 100644
>> --- a/gcc/optabs.def
>> +++ b/gcc/optabs.def
>> @@ -220,6 +220,8 @@ OPTAB_D (reload_in_optab, "reload_in$a")
>>  OPTAB_D (reload_out_optab, "reload_out$a")
>>  
>>  OPTAB_DC(cbranch_optab, "cbranch$a4", COMPARE)
>> +OPTAB_D (tbranch_eq_optab, "tbranch_eq$a4")
>> +OPTAB_D (tbranch_ne_optab, "tbranch_ne$a4")
>>  OPTAB_D (addcc_optab, "add$acc")
>>  OPTAB_D (negcc_optab, "neg$acc")
>>  OPTAB_D (notcc_optab, "not$acc")
>> diff --git a/gcc/optabs.h b/gcc/optabs.h
>> index cfd7c742d2d21b0539f5227c22a94f32c793d6f7..cd55604bc3d452d7e28c5530bb4793d481766f4f 100644
>> --- a/gcc/optabs.h
>> +++ b/gcc/optabs.h
>> @@ -268,6 +268,10 @@ extern void emit_cmp_and_jump_insns (rtx, rtx, enum rtx_code, rtx,
>>                                   machine_mode, int, rtx,
>>                                   profile_probability prob
>>                                      = profile_probability::uninitialized ());
>> +extern void emit_cmp_and_jump_insns (rtx, rtx, enum rtx_code, rtx,
>> +                                 machine_mode, int, tree, rtx,
>> +                                 profile_probability prob
>> +                                    = profile_probability::uninitialized ());
>>  
>>  /* Generate code to indirectly jump to a location given in the rtx LOC.  */
>>  extern void emit_indirect_jump (rtx);
>> diff --git a/gcc/tree.h b/gcc/tree.h
>> index a863d2e50e5ecafa3f5da4dda98d9637261d07a9..abedaa80a3983ebb6f9ac733b2eaa8d039688f0a 100644
>> --- a/gcc/tree.h
>> +++ b/gcc/tree.h
>> @@ -4726,6 +4726,7 @@ extern tree signed_or_unsigned_type_for (int, tree);
>>  extern tree signed_type_for (tree);
>>  extern tree unsigned_type_for (tree);
>>  extern bool is_truth_type_for (tree, tree);
>> +extern bool tree_zero_one_valued_p (tree);
>>  extern tree truth_type_for (tree);
>>  extern tree build_pointer_type_for_mode (tree, machine_mode, bool);
>>  extern tree build_pointer_type (tree);
