On Sun, Oct 16, 2011 at 12:05 PM, Tom de Vries <tom_devr...@mentor.com> wrote: > On 10/14/2011 12:00 PM, Richard Guenther wrote: >> On Fri, Oct 14, 2011 at 1:12 AM, Tom de Vries <tom_devr...@mentor.com> wrote: >>> On 10/12/2011 02:19 PM, Richard Guenther wrote: >>>> On Wed, Oct 12, 2011 at 8:35 AM, Tom de Vries <vr...@codesourcery.com> >>>> wrote: >>>>> Richard, >>>>> >>>>> I have a patch for PR50672. >>>>> >>>>> When compiling the testcase from the PR with -ftree-tail-merge, the >>>>> scenario is >>>>> as follows: >>>>> >>>>> We start out tail_merge_optimize with blocks 14 and 20, which are alike, >>>>> but not >>>>> equal, since they have different successors: >>>>> ... >>>>> # BLOCK 14 freq:690 >>>>> # PRED: 25 [61.0%] (false,exec) >>>>> >>>>> if (wD.2197_57(D) != 0B) >>>>> goto <bb 15>; >>>>> else >>>>> goto <bb 16>; >>>>> # SUCC: 15 [78.4%] (true,exec) 16 [21.6%] (false,exec) >>>>> >>>>> >>>>> # BLOCK 20 freq:2900 >>>>> # PRED: 29 [100.0%] (fallthru) 31 [100.0%] (fallthru) >>>>> >>>>> # .MEMD.2447_209 = PHI <.MEMD.2447_125(29), .MEMD.2447_129(31)> >>>>> if (wD.2197_57(D) != 0B) >>>>> goto <bb 5>; >>>>> else >>>>> goto <bb 6>; >>>>> # SUCC: 5 [85.0%] (true,exec) 6 [15.0%] (false,exec) >>>>> ... >>>>> >>>>> In the first iteration, we merge block 5 with block 15 and block 6 with >>>>> block >>>>> 16. After that, the blocks 14 and 20 are equal. >>>>> >>>>> In the second iteration, the blocks 14 and 20 are merged, by redirecting >>>>> the >>>>> incoming edges of block 20 to block 14, and removing block 20. >>>>> >>>>> Block 20 also contains the definition of .MEMD.2447_209. Removing the >>>>> definition >>>>> delinks the vuse of .MEMD.2447_209 in block 5: >>>>> ... >>>>> # BLOCK 5 freq:6036 >>>>> # PRED: 20 [85.0%] (true,exec) >>>>> >>>>> # PT = nonlocal escaped >>>>> D.2306_58 = &thisD.2200_10(D)->D.2156; >>>>> # .MEMD.2447_132 = VDEF <.MEMD.2447_209> >>>>> # USE = anything >>>>> # CLB = anything >>>>> drawLineD.2135 (D.2306_58, wD.2197_57(D), gcD.2198_59(D)); >>>>> goto <bb 17>; >>>>> # SUCC: 17 [100.0%] (fallthru,exec) >>>>> ... >>>> >>>> And block 5 is retained and block 15 is discarded? >>>> >>> >>> Indeed. >>> >>>>> After the pass, when executing the TODO_update_ssa_only_virtuals, we >>>>> update the >>>>> drawLine call in block 5 using rewrite_update_stmt, which calls >>>>> maybe_replace_use for the vuse operand. >>>>> >>>>> However, maybe_replace_use doesn't have an effect since the old vuse and >>>>> the new >>>>> vuse happen to be the same (rdef == use), so SET_USE is not called and >>>>> the vuse >>>>> remains delinked: >>>>> ... >>>>> if (rdef && rdef != use) >>>>> SET_USE (use_p, rdef); >>>>> ... >>>>> >>>>> The patch fixes this by forcing SET_USE for delinked uses. >>>> >>>> That isn't the correct fix. Whoever unlinks the vuse (by removing its >>>> definition) has to replace it with something valid, which is either the >>>> bare symbol .MEM, or the VUSE associated with the removed VDEF >>>> (thus, as unlink_stmt_vdef does). >>>> >>> >>> Another try. For each deleted bb, we call unlink_stmt_vdef for the >>> statements, >>> and replace the .MEM phi uses with the bare .MEM symbol. >>> >>> Bootstrapped and reg-tested on x86_64. >>> >>> Ok for trunk? >> >> Better. For >> >> + >> + FOR_EACH_IMM_USE_STMT (use_stmt, iter, res) >> + { >> + FOR_EACH_IMM_USE_ON_STMT (use_p, iter) >> + SET_USE (use_p, SSA_NAME_VAR (res)); >> + } >> >> you can use mark_virtual_phi_result_for_renaming (phi) instead. >> >> + for (i = gsi_last_bb (bb); !gsi_end_p (i); gsi_prev_nondebug (&i)) >> + unlink_stmt_vdef (gsi_stmt (i)); >> >> is that actually necessary? That is, isn't the block that follows a >> deleted block always starting with a virtual PHI? > > Block 20 is deleted. Block 5 follows the deleted block 20. Block 5 does not > start with a virtual PHI. > >> If not it should >> be enough to walk to the first stmt that uses a virtual operand >> and similar to the PHI case replace all its uses with the bare >> symbol. > > I think we need to handle the exposed uses (meaning outside the block) of > vdefs > in each deleted block. The only vdef that can have exposed uses is the last > vdef > in the block. So we should find the last vdef (gimple with gimple_vdef or > virtual PHI) and handle the related uses. > > Bootstrapped and regtested on x86_64. OK for trunk?
Hmmm. I can see that this will fail when the block has a store but not a virtual PHI node, thus, when renaming does not reach a bare .MEM symbol walking the use-def chain from the VUSE of the store. What release_last_vdef should do is trigger a rename of subsequent VUSEs in successors of the deleted block. Which requires you to do something like static void rename_last_vdef (basic_block bb) { + gimple_stmt_iterator i; + + for (i = gsi_last_bb (bb); !gsi_end_p (i); gsi_prev_nondebug (&i)) + { + gimple stmt = gsi_stmt (i); + if (gimple_vdef (stmt) == NULL_TREE) + continue; + + mark_virtual_operand_for_renaming (gimple_vdef (stmt)); return; } + for (i = gsi_start_phis (bb); !gsi_end_p (i); gsi_next (&i)) + { + gimple phi = gsi_stmt (i); + tree res = gimple_phi_result (phi); + + if (is_gimple_reg (res)) + continue; + + mark_virtual_phi_result_for_renaming (phi); + return; + } } And split out the result_var = SSA_NAME_VAR (result_ssa); FOR_EACH_IMM_USE_STMT (stmt, iter, result_ssa) { FOR_EACH_IMM_USE_ON_STMT (use_p, iter) SET_USE (use_p, result_var); update_stmt (stmt); used = true; } if (used) mark_sym_for_renaming (result_var); part of mark_virtual_phi_result_for_renaming into the new function mark_virtual_operand_for_renaming (basically rename it and make mark_virtual_phi_result_for_renaming a wrapper around it, passing gimple_phi_result). Ok with a change as suggested above. Thanks, Richard. > Thanks, > - Tom > >> But as I said, I believe handling PHIs should be sufficient? >> >> Thanks, >> Richard. >> >> >>> Thanks, >>> - Tom >>> >>>> Richard. >>>> >>> > > 2011-10-16 Tom de Vries <t...@codesourcery.com> > > PR tree-optimization/50672 > * tree-ssa-tail-merge.c (release_last_vdef): New function. > (purge_bbs): Add update_vops parameter. Call release_last_vdef for > each > deleted basic block. > (tail_merge_optimize): Add argument to call to purge_bbs. >