On Wed, Sep 18, 2013 at 11:45 AM, Zhenqiang Chen <zhenqiang.c...@arm.com> wrote:
>
>> -----Original Message-----
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: Tuesday, August 27, 2013 8:18 PM
>> To: Richard Earnshaw
>> Cc: Zhenqiang Chen; GCC Patches
>> Subject: Re: [PATCH 1/n] Add conditional compare support
>>
>> On Tue, Aug 27, 2013 at 1:56 PM, Richard Earnshaw <rearn...@arm.com>
>> wrote:
>> > On 27/08/13 12:10, Richard Biener wrote:
>> >> What's this for and what's the desired semantics? I don't like having
>> >> extra tree codes for this.  Is this for a specific instruction set
>> >> feature?
>> >
>> > The background is to support the conditional compare instructions in
>> > ARM (more effectively) and AArch64 at all.
>> >
>> > The current method used in ARM is to expand into a series of
>> > store-flag instructions and then hope that combine can optimize them
>> > away (though that fails far too often, particularly when the first
>> > instruction in the sequence is combined into another pattern).  To
>> > make it work at all the compiler has to lie about the costs of various
>> > store-flag type operations which overall risks producing worse code
>> > and means we also have to support many more complex multi-instruction
>> > patterns than is desirable.  I really don't want to go down the same
> route
>> for AArch64.
>> >
>> > The idea behind all this is to capture potential conditional compare
>> > operations early enough in the mid end that we can keep track of them
>> > until RTL expand time and then to emit the correct logic on all
>> > targets depending on what is the best thing for that target.  The
>> > current method of lowering into store-flag sequences doesn't really cut
> it.
>>
>> It seems to me that then the initial instruction selection process (aka
> RTL
>> expansion) needs to be improved.  As we are expanding with having the CFG
>> around it should be easy enough to detect AND/ORIF cases and do better
>> here.  Yeah, I suppose this asks to turn existing jump expansion
> optimizations
>> up-side-down to optimize with the GIMPLE CFG in mind.
>>
>> The current way of LOGICAL_OP_NON_SHORT_CIRCUIT is certainly bogus -
>> fold-const.c is way too early to decide this.  Similar to the ongoing work
> of
>> expanding / building-up switch expressions in a GIMPLE pass, moving expand
>> complexity up the pipeline this asks for a GIMPLE phase that moves this
>> decision down closer to RTL expansion.
>> (We have tree-ssa-ifcombine.c that is a related GIMPLE transform pass)
>>
>
> The patch is updated according to your comments. It is a basic support,
> which does not touch ifcombine and jump related optimizations yet.
>
> Current method is:
> 1) In fold-const, if HAVE_conditional_compare, set
> LOGICAL_OP_NON_SHORT_CIRCUIT
>    to optimize_function_for_speed_p. So we do not depend on BRANCH_COST.
> 2) Identify CCMP during expanding. A CCMP candidate is a BIT_AND_EXPR
>    or BIT_IOR_EXPR, whose operators are compares.
> 3) Add a new op in optab to expand the CCMP to optimized RTL,
>     e.g. and_scc_scc/ior_scc_scc in ARM.
>
> Bootstrap on ARM Chrome book.
> No make check regression on Pandaboard.

That's better.  A few observations

+static bool
+is_conditional_compare_candidate_p (gimple g)
+{
+  tree rhs = gimple_assign_rhs_to_tree (g);
+  tree lhs, op0, op1;
+  gimple gs0, gs1;
+
+  if (TREE_CODE (rhs) != BIT_AND_EXPR && TREE_CODE (rhs) != BIT_IOR_EXPR)
+    return false;
+
+  lhs = gimple_assign_lhs (g);
+  op0 = TREE_OPERAND (rhs, 0);
+  op1 = TREE_OPERAND (rhs, 1);
+
+  if ((TREE_CODE (op0) != SSA_NAME) || (TREE_CODE (op1) != SSA_NAME)
+      || !has_single_use (lhs))
+    return false;
+
+  gs0 = SSA_NAME_DEF_STMT (op0);
+  gs1 = SSA_NAME_DEF_STMT (op1);

you are not allowed to use arbitrary definition stmts for code generation (you
may have a look at them though).  Use the get_gimple_for_ssa_name helper
to only access definition stmts that have no code generated yet and that
you can modify.

+  gimple gs0 = SSA_NAME_DEF_STMT (TREE_OPERAND (exp, 0));
+  gimple gs1 = SSA_NAME_DEF_STMT (TREE_OPERAND (exp, 1));

See above.

+  if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (gs0)))
+      && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (gs1))))
+    {
+      rtx op0, op1, op2, op3, tmp;
+      tree cmp0_op0, cmp0_op1, cmp1_op0, cmp1_op1;
+      enum machine_mode mode;
+
+      cmp0_op0 = gimple_assign_rhs1 (gs0);
+      cmp0_op1 = gimple_assign_rhs2 (gs0);
+      cmp1_op0 = gimple_assign_rhs1 (gs1);
+      cmp1_op1 = gimple_assign_rhs2 (gs1);
+
+      expand_operands (cmp0_op0, cmp0_op1, NULL_RTX, &op0, &op1,
EXPAND_NORMAL);
+      expand_operands (cmp1_op0, cmp1_op1, NULL_RTX, &op2, &op3,
EXPAND_NORMAL);

as you are not expanding the result of gs0 or gs1 you indeed have to
use get_gimple_for_ssa_name.

+         else
+           r = expand_expr_real (rhs, target, tmode, modifier, NULL);
+#else
+         r = expand_expr_real (rhs, target, tmode, modifier, NULL);
+#endif

just place the #else after the else, no need to duplicate the
expand_expr_real call.

+#ifdef HAVE_conditional_compare
+#undef LOGICAL_OP_NON_SHORT_CIRCUIT
+#define LOGICAL_OP_NON_SHORT_CIRCUIT optimize_function_for_speed_p (cfun)
+#else
 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
   (BRANCH_COST (optimize_function_for_speed_p (cfun), \
                false) >= 2)
 #endif
+#endif

that's of course a hack ;)  If we want to keep LOGICAL_OP_NON_SHORT_CIRCUIT
then it should become a target hook at this point.  But I'd argue that
at the level
of fold-const.c it should simply always be false.

@@ -187,6 +187,7 @@ OPTAB_D (movcc_optab, "mov$acc")
 OPTAB_D (cmov_optab, "cmov$a6")
 OPTAB_D (cstore_optab, "cstore$a4")
 OPTAB_D (ctrap_optab, "ctrap$a4")
+OPTAB_D (ccmp_optab, "conditional_compare")

I'd say the two should match, so ccmp_obtab, "ccmp"

You miss updates to md.texi for the new named patterns.

Richard.

> Thanks!
> -Zhenqiang
>
> ChangeLog:
> 2013-09-18  Zhenqiang Chen  <zhenqiang.c...@linaro.org>
>
>         * config/arm/arm.md (conditional_compare): New.
>         * expr.c (is_conditional_compare_candidate_p, expand_ccmp_expr):
> New.
>         (expand_expr_real_1): Identify conditional compare.
>         * fold-const.c (LOGICAL_OP_NON_SHORT_CIRCUIT): Update.
>         * optabs.c (expand_ccmp_op): New.
>         (get_rtx_code): Handle BIT_AND_EXPR and BIT_IOR_EXPR.
>         * optabs.def (ccmp_optab): New.
>         * optabs.h (expand_ccmp_op): New.

Reply via email to