On Thu, 16 Feb 2012, Jakub Jelinek wrote: > Hi! > > On this testcase we ICE, because slpeel_tree_peel_loop_to_edge is first > called with a loop that has a virtual PHI and no virtual PHI in the loop > exit bb and this function doesn't update the vop properly after inserting > second loop and adding all the conditional guards, then vect_loop_versioning > calls loop_version which deep inside of it performs > verify_loop_closed_ssa (true) > which ICEs on the invalid SSA. If that check is commented out, we proceed > with optionally doing another slpeel_tree_peel_loop_to_edge and finally > when all loops are vectorized, mark the vop for renaming and update ssa, > which fixes it up. > > Here is an attempt to ensure the vop is updated properly, because calling > update_ssa in each slpeel_tree_peel_loop_to_edge would be IMHO too > expensive. The slpeel_tree_peel_loop_to_edge and its helpers already > DTRT for all other PHIs, the problem with virtual PHIs is that the loop > closed SSA form for other PHIs requires a dummy PHI on the exit bb if > it is used after the loop, but for virtual PHIs it doesn't have this > requirement. The simplest patch below just adds an extra PHI even > for the vop if there isn't any and the loop has it, then the routines in > multiple places just DTRT. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2012-02-16 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/52255 > * tree-vect-loop-manip.c (slpeel_tree_peel_loop_to_edge): If > loop->header has virtual PHI, but exit_e->dest doesn't, add > virtual PHI to exit_e->dest and adjust all uses after the loop. > > * gcc.c-torture/compile/pr52255.c: New test. > > --- gcc/tree-vect-loop-manip.c.jj 2012-01-03 16:22:48.000000000 +0100 > +++ gcc/tree-vect-loop-manip.c 2012-02-15 20:15:27.816120830 +0100 > @@ -1171,6 +1171,7 @@ slpeel_tree_peel_loop_to_edge (struct lo > basic_block bb_before_first_loop; > basic_block bb_between_loops; > basic_block new_exit_bb; > + gimple_stmt_iterator gsi; > edge exit_e = single_exit (loop); > LOC loop_loc; > tree cost_pre_condition = NULL_TREE; > @@ -1184,6 +1185,43 @@ slpeel_tree_peel_loop_to_edge (struct lo > the function tree_duplicate_bb is called. */ > gimple_register_cfg_hooks (); > > + /* If the loop has a virtual PHI, but exit bb doesn't, create a virtual PHI > + in the exit bb and rename all the uses after the loop. This simplifies > + the *guard[12] routines, which assume loop closed SSA form for all PHIs > + (but normally loop closed SSA form doesn't require virtual PHIs to be > + in the same form). Doing this early simplifies the checking what > + uses should be renamed. */ > + for (gsi = gsi_start_phis (loop->header); !gsi_end_p (gsi); gsi_next > (&gsi)) > + if (!is_gimple_reg (gimple_phi_result (gsi_stmt (gsi)))) > + { > + gimple phi = gsi_stmt (gsi); > + for (gsi = gsi_start_phis (exit_e->dest); > + !gsi_end_p (gsi); gsi_next (&gsi)) > + if (!is_gimple_reg (gimple_phi_result (gsi_stmt (gsi)))) > + break; > + if (gsi_end_p (gsi)) > + { > + gimple new_phi = create_phi_node (SSA_NAME_VAR (PHI_RESULT (phi)), > + exit_e->dest); > + tree vop = PHI_ARG_DEF_FROM_EDGE (phi, EDGE_SUCC (loop->latch, 0)); > + source_location loc > + = gimple_phi_arg_location_from_edge (phi, EDGE_SUCC (loop->latch, > + 0));
I suppose UNKNOWN_LOCATION for virtual PHIs is good enough. I've put down a note "add a 'gimple virtual_phi_node (basic_block)' helper - if we canonicalize the PHI seq to make the single virtual PHI node come first we can even speed that up". Feel free to add that helper now. Ok as-is or with either or both suggested changes. Thanks, Richard. > + imm_use_iterator imm_iter; > + gimple stmt; > + tree new_vop = make_ssa_name (SSA_NAME_VAR (PHI_RESULT (phi)), > + new_phi); > + use_operand_p use_p; > + > + add_phi_arg (new_phi, vop, exit_e, loc); > + gimple_phi_set_result (new_phi, new_vop); > + FOR_EACH_IMM_USE_STMT (stmt, imm_iter, vop) > + if (stmt != new_phi && gimple_bb (stmt) != loop->header) > + FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter) > + SET_USE (use_p, new_vop); > + } > + break; > + } > > /* 1. Generate a copy of LOOP and put it on E (E is the entry/exit of > LOOP). > Resulting CFG would be: > --- gcc/testsuite/gcc.c-torture/compile/pr52255.c.jj 2012-02-15 > 20:16:23.060799290 +0100 > +++ gcc/testsuite/gcc.c-torture/compile/pr52255.c 2012-02-15 > 20:16:05.000000000 +0100 > @@ -0,0 +1,12 @@ > +/* PR tree-optimization/52255 */ > + > +int a, b, c[10], d[10] = { 0, 0 }; > + > +void > +foo (void) > +{ > + for (a = 1; a <= 4; a += 1) > + d[a] = d[1]; > + for (; b; ++b) > + c[0] |= 1; > +} > > Jakub > > -- Richard Guenther <rguent...@suse.de> SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer