On Mon, Feb 9, 2015 at 11:33 AM, Bin Cheng <bin.ch...@arm.com> wrote:
> The second time I missed patch in one day, I hate myself.
> Here it is.

I think the patch is reasonable but I would have used a default = NULL
arg for 'stop' to make the patch smaller.  You don't constrain 'stop'
to being an SSA name - any particular reason for that?  It would
make the comparison in expand_simple_operations simpler
and it could be extended to be a bitmap of SSA name versions.

So - I'd like you to constrain 'stop' and check it like

  if (TREE_CODE (expr) != SSA_NAME
     || expr == stop)
    return expr;

and declare

-extern tree expand_simple_operations (tree);
+extern tree expand_simple_operations (tree, tree = NULL_TREE);

Ok with that change.

Thanks,
Richard.

>> -----Original Message-----
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of Bin Cheng
>> Sent: Monday, February 09, 2015 6:10 PM
>> To: gcc-patches@gcc.gnu.org
>> Subject: [PATCH PR64705]Don't aggressively expand induction variable's
> base
>>
>> Hi,
>> As comments in the PR, root cause is GCC aggressively expand induction
>> variable's base.  This patch avoids that by adding new parameter to
>> expand_simple_operations thus we can stop expansion whenever ssa var
>> referred by IV's step is encountered.  As comments in patch, we could stop
>> expanding at each ssa var referred by IV's step, but that's expensive and
> not
>> likely to happen, this patch only extracts the first ssa var and skips
> expanding
>> accordingly.
>> For the new test case, currently the loop body is bloated as below after
>> IVOPT:
>>
>> <bb 5>:
>>   # ci_28 = PHI <ci_12(D)(4), ci_17(6)>
>>   # ivtmp.13_31 = PHI <ivtmp.13_25(4), ivtmp.13_27(6)>
>>   ci_17 = ci_28 + 1;
>>   _1 = (void *) ivtmp.13_31;
>>   MEM[base: _1, offset: 0B] = 0;
>>   ivtmp.13_27 = ivtmp.13_31 + _26;
>>   _34 = (unsigned long) _13;
>>   _35 = (unsigned long) flags_8(D);
>>   _36 = _34 - _35;
>>   _37 = (unsigned long) step_14;
>>   _38 = _36 - _37;
>>   _39 = ivtmp.13_27 + _38;
>>   _40 = _39 + 3;
>>   iter_33 = (long int) _40;
>>   if (len_16(D) >= iter_33)
>>     goto <bb 6>;
>>   else
>>     goto <bb 7>;
>>
>> <bb 6>:
>>   goto <bb 5>;
>>
>> And it can be improved by this patch as below:
>>
>> <bb 5>:
>>   # steps_28 = PHI <steps_12(D)(4), steps_17(6)>
>>   # iter_29 = PHI <iter_15(4), iter_21(6)>
>>   steps_17 = steps_28 + 1;
>>   _31 = (sizetype) iter_29;
>>   MEM[base: flags_8(D), index: _31, offset: 0B] = 0;
>>   iter_21 = step_14 + iter_29;
>>   if (len_16(D) >= iter_21)
>>     goto <bb 6>;
>>   else
>>     goto <bb 7>;
>>
>> <bb 6>:
>>   goto <bb 5>;
>>
>>
>> I think this is a corner case, it only changes several files' assembly
> code
>> slightly in spec2k6.  Among these files, only one of them is regression.
> I
>> looked into the regression and thought it was because of passes after
> IVOPT.
>> The IVOPT dump is at least not worse than the original version.
>>
>> Bootstrap and test on x86_64 and AArch64, so is it OK?
>>
>> 2015-02-09  Bin Cheng  <bin.ch...@arm.com>
>>
>>       PR tree-optimization/64705
>>       * tree-ssa-loop-niter.h (expand_simple_operations): New
>> parameter.
>>       * tree-ssa-loop-niter.c (expand_simple_operations): New parameter.
>>       (tree_simplify_using_condition_1, refine_bounds_using_guard)
>>       (number_of_iterations_exit): Pass new argument to
>>       expand_simple_operations.
>>       * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New.
>>       (find_bivs, find_givs_in_stmt_scev): Pass new argument to
>>       expand_simple_operations.  Call extract_single_var_from_expr.
>>       (difference_cannot_overflow_p): Pass new argument to
>>       expand_simple_operations.
>>
>> gcc/testsuite/ChangeLog
>> 2015-02-09  Bin Cheng  <bin.ch...@arm.com>
>>
>>       PR tree-optimization/64705
>>       * gcc.dg/tree-ssa/pr64705.c: New test.
>>
>>
>>
>>

Reply via email to