Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments

Richard Biener Wed, 14 Sep 2016 04:32:56 -0700

On Fri, Sep 2, 2016 at 10:09 AM, Kugan Vivekanandarajah
<kugan.vivekanandara...@linaro.org> wrote:
> Hi Richard,
>
> On 25 August 2016 at 22:24, Richard Biener <richard.guent...@gmail.com> wrote:
>> On Thu, Aug 11, 2016 at 1:09 AM, kugan
>> <kugan.vivekanandara...@linaro.org> wrote:
>>> Hi,
>>>
>>>
>>> On 10/08/16 20:28, Richard Biener wrote:
>>>>
>>>> On Wed, Aug 10, 2016 at 10:57 AM, Jakub Jelinek <ja...@redhat.com> wrote:
>>>>>
>>>>> On Wed, Aug 10, 2016 at 08:51:32AM +1000, kugan wrote:
>>>>>>
>>>>>> I see it now. The problem is we are just looking at (-1) being in the
>>>>>> ops
>>>>>> list for passing changed to rewrite_expr_tree in the case of
>>>>>> multiplication
>>>>>> by negate.  If we have combined (-1), as in the testcase, we will not
>>>>>> have
>>>>>> the (-1) and will pass changed=false to rewrite_expr_tree.
>>>>>>
>>>>>> We should set changed based on what happens in try_special_add_to_ops.
>>>>>> Attached patch does this. Bootstrap and regression testing are ongoing.
>>>>>> Is
>>>>>> this OK for trunk if there is no regression.
>>>>>
>>>>>
>>>>> I think the bug is elsewhere.  In particular in
>>>>> undistribute_ops_list/zero_one_operation/decrement_power.
>>>>> All those look problematic in this regard, they change RHS of statements
>>>>> to something that holds a different value, while keeping the LHS.
>>>>> So, generally you should instead just add a new stmt next to the old one,
>>>>> and adjust data structures (replace the old SSA_NAME in some ->op with
>>>>> the new one).  decrement_power might be a problem here, dunno if all the
>>>>> builtins are const in all cases that DSE would kill the old one,
>>>>> Richard, any preferences for that?  reset flow sensitive info + reset
>>>>> debug
>>>>> stmt uses, or something different?  Though, replacing the LHS with a new
>>>>> anonymous SSA_NAME might be needed too, in case it is before SSA_NAME of
>>>>> a
>>>>> user var that doesn't yet have any debug stmts.
>>>>
>>>>
>>>> I'd say replacing the LHS is the way to go, with calling the appropriate
>>>> helper
>>>> on the old stmt to generate a debug stmt for it / its uses (would need
>>>> to look it
>>>> up here).
>>>>
>>>
>>> Here is an attempt to fix it. The problem arises when in
>>> undistribute_ops_list, we linearize_expr_tree such that NEGATE_EXPR is added
>>> (-1) MULT_EXPR (OP). Real problem starts when we handle this in
>>> zero_one_operation. Unlike what was done earlier, we now change the stmt
>>> (with propagate_op_to_single_use or directly) such that the value
>>> computed by stmt is no longer what it used to be. Because of this, what is
>>> computed in undistribute_ops_list and rewrite_expr_tree are also changed.
>>>
>>> undistribute_ops_list already expects this but rewrite_expr_tree will not if
>>> we don't pass changed as an argument.
>>>
>>> The way I am fixing this now is, in linearize_expr_tree, I set ops_changed
>>> to true if we change NEGATE_EXPR to (-1) MULT_EXPR (OP). Then when we call
>>> zero_one_operation with ops_changed = true, I replace all the LHS in
>>> zero_one_operation with the new SSA and replace all the uses. I also call
>>> the rewrite_expr_tree with changed = false in this case.
>>>
>>> Does this make sense? Bootstrapped and regression tested for
>>> x86_64-linux-gnu without any new regressions.
>>
>> I don't think this solves the issue.  zero_one_operation associates the
>> chain starting at the first *def and it will change the intermediate values
>> of _all_ of the stmts visited until the operation to be removed is found.
>> Note that this is independent of whether try_special_add_to_ops did anything.
>>
>> Even for the regular undistribution cases we get this wrong.
>>
>> So we need to back-track in zero_one_operation, replacing each LHS
>> and in the end the op in the opvector of the main chain.  That's basically
>> the same as if we'd do a regular re-assoc operation on the sub-chains.
>> Take their subops, simulate zero_one_operation by
>> appending the cancelling operation and optimizing the oplist, and then
>> materializing the associated ops via rewrite_expr_tree.
>>
> Here is a draft patch which records the stmt chain while in
> zero_one_operation and then fixes it when OP is removed. When we
> update *def, that will update the ops vector. Does this look sane?


Yes.  A few comments below:

+  /* PR72835 - Record the stmt chain that has to be updated such that
+     we dont use the same LHS when the values computed are different.  */
+  auto_vec<gimple *> stmts_to_fix;

Use auto_vec<gimple *, 64> here so that we get stack allocation only
most of the time.
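
As a rough illustration of why the inline size helps (this is not GCC
code; small_vec and all of its members are made-up names, and it only
handles simple element types), a vector with N inline slots touches the
heap only once more than N elements are pushed:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the idea behind auto_vec<T, N>: the first N
// elements live in storage embedded in the object itself (on the stack
// for a local variable); only when that fills up do we spill to the heap.
// Assumes T is default-constructible and cheap to copy.
template <typename T, std::size_t N>
class small_vec
{
public:
  void safe_push (const T &value)
  {
    if (!m_spilled && m_length < N)
      m_inline[m_length] = value;
    else
      {
        if (!m_spilled)
          {
            // First overflow: move everything collected so far to the heap.
            m_heap.assign (m_inline.begin (), m_inline.begin () + m_length);
            m_spilled = true;
          }
        m_heap.push_back (value);
      }
    ++m_length;
  }

  T pop ()
  {
    assert (m_length > 0);
    --m_length;
    if (m_spilled)
      {
        T value = m_heap.back ();
        m_heap.pop_back ();
        return value;
      }
    return m_inline[m_length];
  }

  const T &operator[] (std::size_t i) const
  {
    return m_spilled ? m_heap[i] : m_inline[i];
  }

  std::size_t length () const { return m_length; }
  bool uses_heap () const { return m_spilled; }

private:
  std::array<T, N> m_inline{};   // embedded storage, no allocation
  std::vector<T> m_heap;         // used only after the inline slots run out
  std::size_t m_length = 0;
  bool m_spilled = false;
};
```

With N = 64, a chain of up to 64 recorded stmts would never allocate in
this model, which is the common case being aimed for.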

          if (stmt_is_power_of_op (stmt, op))
            {
+             make_new_ssa_for_all_defs (def, op, stmts_to_fix);
              if (decrement_power (stmt) == 1)
                propagate_op_to_single_use (op, stmt, def);

For the cases where you end up calling propagate_op_to_single_use, its
stmt argument is handled superfluously when making the new SSA names; I
suggest popping it from the stmts_to_fix vector in that case.  I also
suggest using break instead of return in all cases and doing the
make_new_ssa_for_all_defs call once at the end of the function.
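
Roughly the shape I have in mind, as a toy model only (toy_stmt,
toy_kind and toy_zero_one_operation are invented for illustration and
do not correspond to real GIMPLE data structures; the renaming loop
merely stands in for make_new_ssa_for_all_defs):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy classification of the stmts met while walking the chain.
enum class toy_kind
{
  plain,        // intermediate stmt; its value changes once OP is removed
  power_of_op,  // stmt stays in place but computes a different value
  mult_by_op    // stmt is propagated away entirely
};

struct toy_stmt
{
  std::string lhs;
  toy_kind kind;
};

// Walk the chain, record every stmt whose value changes, and give all of
// them a fresh LHS in one place after the loop.  Returns how many stmts
// received a new LHS.
std::size_t
toy_zero_one_operation (std::vector<toy_stmt> &chain)
{
  std::vector<toy_stmt *> stmts_to_fix;

  for (toy_stmt &stmt : chain)
    {
      stmts_to_fix.push_back (&stmt);

      if (stmt.kind == toy_kind::mult_by_op)
        {
          // Models the propagate_op_to_single_use case: the stmt goes
          // away, so giving it a new LHS would be wasted work.
          stmts_to_fix.pop_back ();
          break;
        }

      if (stmt.kind == toy_kind::power_of_op)
        break;  // the stmt survives with a new value, keep it recorded
    }

  // Single exit: every path above ends with break, so the fix-up is
  // written exactly once instead of before each return.
  unsigned counter = 0;
  for (toy_stmt *stmt : stmts_to_fix)
    stmt->lhs = "_new" + std::to_string (counter++);

  return stmts_to_fix.size ();
}
```

The point is only the control flow: one recording vector, break on
every terminating case, and a single fix-up at the end.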

@@ -1253,14 +1305,18 @@ zero_one_operation (tree *def, enum tree_code
opcode, tree op)
              if (gimple_assign_rhs1 (stmt2) == op)
                {
                  tree cst = build_minus_one_cst (TREE_TYPE (op));
+                 stmts_to_fix.safe_push (stmt2);
+                 make_new_ssa_for_all_defs (def, op, stmts_to_fix);
                  propagate_op_to_single_use (cst, stmt2, def);
                  return;

this safe_push should be unnecessary for the above reason (others are
conditionally unnecessary).

I thought about simplifying the whole thing: instead of clearing an op
from the chain, prepend one that does the job by having reassoc itself
visit the chain.  But that doesn't work out for RDIV_EXPR, nor does it
play well with undistribute handling multiple opportunities on the same
chain.

Thanks,
Richard.


>
> Bootstrapped and regression tested on x86_64-linux-gnu with no new 
> regressions.
>
> Thanks,
> Kugan
