On Thu, Feb 2, 2012 at 11:44 AM, Tom de Vries <tom_devr...@mentor.com> wrote:
> Richard,
>
> this patch fixes PR52801.
>
> Consider test-case pr51879-12.c:
> ...
> __attribute__((pure)) int bar (int);
> __attribute__((pure)) int bar2 (int);
> void baz (int);
>
> int x, z;
>
> void
> foo (int y)
> {
>  int a = 0;
>  if (y == 6)
>    {
>      a += bar (7);
>      a += bar2 (6);
>    }
>  else
>    {
>      a += bar2 (6);
>      a += bar (7);
>    }
>  baz (a);
> }
> ...
>
> When compiling at -O2, pr51879-12.c.094t.pre looks like this:
> ...
>  # BLOCK 3 freq:1991
>  # PRED: 2 [19.9%]  (true,exec)
>  # VUSE <.MEMD.1722_12(D)>
>  # USE = nonlocal escaped
>  D.1717_4 = barD.1703 (7);
>  # VUSE <.MEMD.1722_12(D)>
>  # USE = nonlocal escaped
>  D.1718_6 = bar2D.1705 (6);
>  aD.1713_7 = D.1717_4 + D.1718_6;
>  goto <bb 5>;
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 4 freq:8009
>  # PRED: 2 [80.1%]  (false,exec)
>  # VUSE <.MEMD.1722_12(D)>
>  # USE = nonlocal escaped
>  D.1720_8 = bar2D.1705 (6);
>  # VUSE <.MEMD.1722_12(D)>
>  # USE = nonlocal escaped
>  D.1721_10 = barD.1703 (7);
>  aD.1713_11 = D.1720_8 + D.1721_10;
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 5 freq:10000
>  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
>  # aD.1713_1 = PHI <aD.1713_7(3), aD.1713_11(4)>
>  # .MEMD.1722_13 = VDEF <.MEMD.1722_12(D)>
>  # USE = nonlocal
>  # CLB = nonlocal
>  bazD.1707 (aD.1713_1);
>  # VUSE <.MEMD.1722_13>
>  return;
> ...
> block 3 and 4 can be tail-merged.
>
> Value numbering numbers the two phi arguments a_7 and a_11 the same so the
> problem is not in value numbering:
> ...
> Setting value number of a_11 to a_7 (changed)
> ...
>
> There are 2 reasons that tail_merge_optimize doesn't optimize this:
>
> 1.
> The clause
>  is_gimple_assign (stmt) && local_def (gimple_get_lhs (stmt))
>  && !gimple_has_side_effects (stmt)
> used in both same_succ_hash and gsi_advance_bw_nondebug_nonlocal evaluates to
> false for pure calls.
> This is fixed by replacing is_gimple_assign with gimple_has_lhs.
>
> 2.
> In same_succ_equal we check gimples from the 2 bbs side-by-side:
> ...
>  gsi1 = gsi_start_nondebug_bb (bb1);
>  gsi2 = gsi_start_nondebug_bb (bb2);
>  while (!(gsi_end_p (gsi1) || gsi_end_p (gsi2)))
>    {
>      s1 = gsi_stmt (gsi1);
>      s2 = gsi_stmt (gsi2);
>      if (gimple_code (s1) != gimple_code (s2))
>        return 0;
>      if (is_gimple_call (s1) && !gimple_call_same_target_p (s1, s2))
>        return 0;
>      gsi_next_nondebug (&gsi1);
>      gsi_next_nondebug (&gsi2);
>    }
> ...
> and we'll be comparing 'bar (7)' and 'bar2 (6)', and gimple_call_same_target_p
> will return false.
> This is fixed by ignoring local defs in this comparison, by using
> gsi_advance_fw_nondebug_nonlocal on the iterators.
>
> bootstrapped and reg-tested on x86_64.
>
> ok for stage1?

Sorry for responding so late ... I think these fixes hint at that we should
use "structural" equality as fallback if value-numbering doesn't equate
two stmt effects.  Thus, treat two stmts with exactly the same operands
and flags as equal and using value-numbering to canonicalize operands
(when they are SSA names) for that comparison, or use VN entirely
if there are no side-effects on the stmt.

Changing value-numbering of virtual operands, even if it looks correct in the
simple cases you change, doesn't look like a general solution for the missed
tail merging opportunities.

Richard.

> Thanks,
> - Tom
>
> 2012-02-02  Tom de Vries  <t...@codesourcery.com>
>
>        * tree-ssa-tail-merge.c (local_def): Move up.
>        (stmt_local_def): New function, factored out of same_succ_hash.  Use
>        gimple_has_lhs instead of is_gimple_assign.
>        (gsi_advance_nondebug_nonlocal): New function, factored out of
>        gsi_advance_bw_nondebug_nonlocal.  Use stmt_local_def.
>        (gsi_advance_fw_nondebug_nonlocal): New function.
>        (gsi_advance_bw_nondebug_nonlocal): Use gsi_advance_nondebug_nonlocal.
>        Move up.
>        (same_succ_hash): Use stmt_local_def.
>        (same_succ_equal): Use gsi_advance_fw_nondebug_nonlocal.
>
>        * gcc.dg/pr51879-12.c: New test.

Reply via email to