On Thu, Feb 2, 2012 at 11:44 AM, Tom de Vries <tom_devr...@mentor.com> wrote: > Richard, > > this patch fixes PR52801. > > Consider test-case pr51879-12.c: > ... > __attribute__((pure)) int bar (int); > __attribute__((pure)) int bar2 (int); > void baz (int); > > int x, z; > > void > foo (int y) > { > int a = 0; > if (y == 6) > { > a += bar (7); > a += bar2 (6); > } > else > { > a += bar2 (6); > a += bar (7); > } > baz (a); > } > ... > > When compiling at -O2, pr51879-12.c.094t.pre looks like this: > ... > # BLOCK 3 freq:1991 > # PRED: 2 [19.9%] (true,exec) > # VUSE <.MEMD.1722_12(D)> > # USE = nonlocal escaped > D.1717_4 = barD.1703 (7); > # VUSE <.MEMD.1722_12(D)> > # USE = nonlocal escaped > D.1718_6 = bar2D.1705 (6); > aD.1713_7 = D.1717_4 + D.1718_6; > goto <bb 5>; > # SUCC: 5 [100.0%] (fallthru,exec) > > # BLOCK 4 freq:8009 > # PRED: 2 [80.1%] (false,exec) > # VUSE <.MEMD.1722_12(D)> > # USE = nonlocal escaped > D.1720_8 = bar2D.1705 (6); > # VUSE <.MEMD.1722_12(D)> > # USE = nonlocal escaped > D.1721_10 = barD.1703 (7); > aD.1713_11 = D.1720_8 + D.1721_10; > # SUCC: 5 [100.0%] (fallthru,exec) > > # BLOCK 5 freq:10000 > # PRED: 3 [100.0%] (fallthru,exec) 4 [100.0%] (fallthru,exec) > # aD.1713_1 = PHI <aD.1713_7(3), aD.1713_11(4)> > # .MEMD.1722_13 = VDEF <.MEMD.1722_12(D)> > # USE = nonlocal > # CLB = nonlocal > bazD.1707 (aD.1713_1); > # VUSE <.MEMD.1722_13> > return; > ... > block 3 and 4 can be tail-merged. > > Value numbering numbers the two phi arguments a_7 and a_11 the same so the > problem is not in value numbering: > ... > Setting value number of a_11 to a_7 (changed) > ... > > There are 2 reasons that tail_merge_optimize doesn't optimize this: > > 1. > The clause > is_gimple_assign (stmt) && local_def (gimple_get_lhs (stmt)) > && !gimple_has_side_effects (stmt) > used in both same_succ_hash and gsi_advance_bw_nondebug_nonlocal evaluates to > false for pure calls. > This is fixed by replacing is_gimple_assign with gimple_has_lhs. > > 2. > In same_succ_equal we check gimples from the 2 bbs side-by-side: > ... > gsi1 = gsi_start_nondebug_bb (bb1); > gsi2 = gsi_start_nondebug_bb (bb2); > while (!(gsi_end_p (gsi1) || gsi_end_p (gsi2))) > { > s1 = gsi_stmt (gsi1); > s2 = gsi_stmt (gsi2); > if (gimple_code (s1) != gimple_code (s2)) > return 0; > if (is_gimple_call (s1) && !gimple_call_same_target_p (s1, s2)) > return 0; > gsi_next_nondebug (&gsi1); > gsi_next_nondebug (&gsi2); > } > ... > and we'll be comparing 'bar (7)' and 'bar2 (6)', and gimple_call_same_target_p > will return false. > This is fixed by ignoring local defs in this comparison, by using > gsi_advance_fw_nondebug_nonlocal on the iterators. > > bootstrapped and reg-tested on x86_64. > > ok for stage1?
Sorry for responding so late ... I think these fixes hint at that we should use "structural" equality as fallback if value-numbering doesn't equate two stmt effects. Thus, treat two stmts with exactly the same operands and flags as equal and using value-numbering to canonicalize operands (when they are SSA names) for that comparison, or use VN entirely if there are no side-effects on the stmt. Changing value-numbering of virtual operands, even if it looks correct in the simple cases you change, doesn't look like a general solution for the missed tail merging opportunities. Richard. > Thanks, > - Tom > > 2012-02-02 Tom de Vries <t...@codesourcery.com> > > * tree-ssa-tail-merge.c (local_def): Move up. > (stmt_local_def): New function, factored out of same_succ_hash. Use > gimple_has_lhs instead of is_gimple_assign. > (gsi_advance_nondebug_nonlocal): New function, factored out of > gsi_advance_bw_nondebug_nonlocal. Use stmt_local_def. > (gsi_advance_fw_nondebug_nonlocal): New function. > (gsi_advance_bw_nondebug_nonlocal): Use gsi_advance_nondebug_nonlocal. > Move up. > (same_succ_hash): Use stmt_local_def. > (same_succ_equal): Use gsi_advance_fw_nondebug_nonlocal. > > * gcc.dg/pr51879-12.c: New test.