On Mon, Feb 9, 2015 at 11:33 AM, Bin Cheng <bin.ch...@arm.com> wrote: > The second time I missed patch in one day, I hate myself. > Here it is.
I think the patch is reasonable but I would have used a default = NULL arg for 'stop' to make the patch smaller. You don't constrain 'stop' to being an SSA name - any particular reason for that? It would make the comparison in expand_simple_operations simpler and it could be extended to be a bitmap of SSA name versions. So - I'd like you to constrain 'stop' and check it like if (TREE_CODE (expr) != SSA_NAME || expr == stop) return expr; and declare -extern tree expand_simple_operations (tree); +extern tree expand_simple_operations (tree, tree = NULL_TREE); Ok with that change. Thanks, Richard. >> -----Original Message----- >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- >> ow...@gcc.gnu.org] On Behalf Of Bin Cheng >> Sent: Monday, February 09, 2015 6:10 PM >> To: gcc-patches@gcc.gnu.org >> Subject: [PATCH PR64705]Don't aggressively expand induction variable's > base >> >> Hi, >> As comments in the PR, root cause is GCC aggressively expand induction >> variable's base. This patch avoids that by adding new parameter to >> expand_simple_operations thus we can stop expansion whenever ssa var >> referred by IV's step is encountered. As comments in patch, we could stop >> expanding at each ssa var referred by IV's step, but that's expensive and > not >> likely to happen, this patch only extracts the first ssa var and skips > expanding >> accordingly. >> For the new test case, currently the loop body is bloated as below after >> IVOPT: >> >> <bb 5>: >> # ci_28 = PHI <ci_12(D)(4), ci_17(6)> >> # ivtmp.13_31 = PHI <ivtmp.13_25(4), ivtmp.13_27(6)> >> ci_17 = ci_28 + 1; >> _1 = (void *) ivtmp.13_31; >> MEM[base: _1, offset: 0B] = 0; >> ivtmp.13_27 = ivtmp.13_31 + _26; >> _34 = (unsigned long) _13; >> _35 = (unsigned long) flags_8(D); >> _36 = _34 - _35; >> _37 = (unsigned long) step_14; >> _38 = _36 - _37; >> _39 = ivtmp.13_27 + _38; >> _40 = _39 + 3; >> iter_33 = (long int) _40; >> if (len_16(D) >= iter_33) >> goto <bb 6>; >> else >> goto <bb 7>; >> >> <bb 6>: >> goto <bb 5>; >> >> And it can be improved by this patch as below: >> >> <bb 5>: >> # steps_28 = PHI <steps_12(D)(4), steps_17(6)> >> # iter_29 = PHI <iter_15(4), iter_21(6)> >> steps_17 = steps_28 + 1; >> _31 = (sizetype) iter_29; >> MEM[base: flags_8(D), index: _31, offset: 0B] = 0; >> iter_21 = step_14 + iter_29; >> if (len_16(D) >= iter_21) >> goto <bb 6>; >> else >> goto <bb 7>; >> >> <bb 6>: >> goto <bb 5>; >> >> >> I think this is a corner case, it only changes several files' assembly > code >> slightly in spec2k6. Among these files, only one of them is regression. > I >> looked into the regression and thought it was because of passes after > IVOPT. >> The IVOPT dump is at least not worse than the original version. >> >> Bootstrap and test on x86_64 and AArch64, so is it OK? >> >> 2015-02-09 Bin Cheng <bin.ch...@arm.com> >> >> PR tree-optimization/64705 >> * tree-ssa-loop-niter.h (expand_simple_operations): New >> parameter. >> * tree-ssa-loop-niter.c (expand_simple_operations): New parameter. >> (tree_simplify_using_condition_1, refine_bounds_using_guard) >> (number_of_iterations_exit): Pass new argument to >> expand_simple_operations. >> * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New. >> (find_bivs, find_givs_in_stmt_scev): Pass new argument to >> expand_simple_operations. Call extract_single_var_from_expr. >> (difference_cannot_overflow_p): Pass new argument to >> expand_simple_operations. >> >> gcc/testsuite/ChangeLog >> 2015-02-09 Bin Cheng <bin.ch...@arm.com> >> >> PR tree-optimization/64705 >> * gcc.dg/tree-ssa/pr64705.c: New test. >> >> >> >>