Ping. Teresa On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson <tejohn...@google.com> wrote: > Revised patch that fixes failures encountered when enabling > -freorder-blocks-and-partition, including the failure reported in PR 53743. > > This includes new verification code to ensure no cold blocks dominate hot > blocks contributed by Steven Bosscher. > > I attempted to make the handling of partition updates through the optimization > passes much more consistent, removing a number of partial fixes in the code > stream in the process. The code to fixup partitions (including the > BB_PARTITION > assignement, region crossing jump notes, and switch text section notes) is > now handled in a few centralized locations. For example, inside > rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that > callers > don't need to attempt the fixup themselves. > > For optimization passes that make adjustments to the cfg while in cfg layout > mode that are not easy to fix up incrementally, the new routine > fixup_partitions handles the cleanup globally. This does require calculation > of the dominance relation, however, as far as I can tell the routines which > now invoke this global fixup (try_optimize_cfg and commit_edge_insertions) > are invoked typically once (or a small number of times in the case of > try_optimize_cfg) per optimization pass. Additionally, I compared the > -ftime-report output for some large fdo compilations and saw only minimal > increases in the dominance computation times, which were only a tiny percent > of the overall compile time. > > Additionally, I added a flag to the rtl_data structure to indicate whether > any partitioning was actually performed, so that optimizations which were > conservatively disabled whenever the flag_reorder_blocks_and_partition > is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less > conservative for functions where no partitions were formed (e.g. they are > completely hot). > > Bootstrapped and tested on x86_64-unknown-linux-gnu. Also tested with > SPEC2006 int > benchmarks and internal google benchmarks using profile feedback and > -freorder-blocks-and-partition to get more coverage. Ok for trunk? > > Thanks, > Teresa > > 2012-11-14 Teresa Johnson <tejohn...@google.com> > Steven Bosscher <ste...@gcc.gnu.org> > > * cfghooks.h (cfg_layout_finalize): New parameter. > * modulo-sched.c (rest_of_handle_sms): New cfg_layout_finalize > parameter. > * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert > as this is now done by redirect_edge_and_branch_force. > * function.c (thread_prologue_and_epilogue_insns): Insert new bb after > barriers, new cfg_layout_finalize parameter, and don't store exit > predecessor BB until after it is potentially split. > * function.h (struct rtl_data): New flag has_bb_partition. > * hw-doloop.c (reorder_loops): New cfg_layout_finalize parameter. > * cfgcleanup.c (try_crossjump_to_edge): Only skip optimization if > any blocks in function actually partitioned. > (try_optimize_cfg): If cfg changed, invoke fixup_partitions to clean > up partitioning. > * bb-reorder.c (connect_traces): Only look for partitions and skip > block copying if any blocks in function actually partitioned. > (emit_barrier_after_bb): Handle insertion in non-cfglayout mode. > (find_rarely_executed_basic_blocks_and_crossing_edges): Ensure > that no cold blocks dominate a hot block. > (fix_up_fall_thru_edges): Replace BB_COPY_PARTITION with assert > as this is now done by force_nonfallthru_and_redirect. > (add_reg_crossing_jump_notes): Handle the fact that some jumps may > already be marked with region crossing note. > (reorder_basic_blocks): Only need to verify partitions if any > blocks in function actually partitioned. > (insert_section_boundary_note): Only need to insert note if any > blocks in function actually partitioned. > (rest_of_handle_reorder_blocks): New cfg_layout_finalize > parameter, and remove call to insert_section_boundary_note as this > is now called via cfg_layout_finalize/fixup_reorder_chain. > (duplicate_computed_gotos): New cfg_layout_finalize > parameter. > (partition_hot_cold_basic_blocks): Set flag indicating function > has bb partitions. > * bb-reorder.h: Declare insert_section_boundary_note and > emit_barrier_after_bb, which are no longer static. > * basic-block.h: Declare new function fixup_partitions. > * cfgrtl.c (try_redirect_by_replacing_jump): Remove unnecessary > check for region crossing note. > (fixup_partition_crossing): New function. > (fixup_bb_partition): Ditto. > (rtl_redirect_edge_and_branch): Fixup partition boundaries. > (force_nonfallthru_and_redirect): Fixup partition boundaries, > remove old code that tried to do this. Emit barrier correctly > when we are in cfglayout mode. > (rtl_split_edge): Correctly fixup partition boundaries. > (commit_one_edge_insertion): Remove old code that tried to > fixup region crossing edge since this is now handled in > split_block, and set up insertion point correctly since > block may now end in a jump. > (commit_edge_insertions): Invoke fixup_partitions to sanitize > partition > boundaries after optimizations that modify cfg and before trying to > verify the flow info. > (fixup_partitions): New function. > (rtl_verify_flow_info_1): Add verification that no cold bbs dominate > hot bbs. > (record_effective_endpoints): Remove region-crossing notes and set > flag > indicating that they need to be reinserted on exit from cfglayout > mode. > (outof_cfg_layout_mode): New cfg_layout_finalize parameter. > (fixup_reorder_chain): Call insert_section_boundary_note if necessary. > Remove old code that attempted to fixup region crossing note as > this is now handled in force_nonfallthru_and_redirect. > (duplicate_insn_chain): Don't duplicate switch section notes. > (cfg_layout_finalize): Pass new parameter to fixup_reorder_chain. > (rtl_can_remove_branch_p): Remove unnecessary check for region > crossing > note. > > Index: cfghooks.h > =================================================================== > --- cfghooks.h (revision 193376) > +++ cfghooks.h (working copy) > @@ -204,7 +204,7 @@ extern void copy_bbs (basic_block *, unsigned, bas > void account_profile_record (struct profile_record *, int); > > extern void cfg_layout_initialize (unsigned int); > -extern void cfg_layout_finalize (void); > +extern void cfg_layout_finalize (bool); > > /* Hooks containers. */ > extern struct cfg_hooks gimple_cfg_hooks; > @@ -218,4 +218,3 @@ extern void cfg_layout_rtl_register_cfg_hooks (voi > extern void gimple_register_cfg_hooks (void); > extern struct cfg_hooks get_cfg_hooks (void); > extern void set_cfg_hooks (struct cfg_hooks); > - > Index: modulo-sched.c > =================================================================== > --- modulo-sched.c (revision 193376) > +++ modulo-sched.c (working copy) > @@ -3354,7 +3354,7 @@ rest_of_handle_sms (void) > if (bb->next_bb != EXIT_BLOCK_PTR) > bb->aux = bb->next_bb; > free_dominance_info (CDI_DOMINATORS); > - cfg_layout_finalize (); > + cfg_layout_finalize (false); > #endif /* INSN_SCHEDULING */ > return 0; > } > Index: ifcvt.c > =================================================================== > --- ifcvt.c (revision 193376) > +++ ifcvt.c (working copy) > @@ -3900,10 +3900,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg > if (new_bb) > { > df_bb_replace (then_bb_index, new_bb); > - /* Since the fallthru edge was redirected from test_bb to new_bb, > - we need to ensure that new_bb is in the same partition as > - test bb (you can not fall through across section boundaries). */ > - BB_COPY_PARTITION (new_bb, test_bb); > + /* This should have been done above via force_nonfallthru_and_redirect > + (possibly called from redirect_edge_and_branch_force). */ > + gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb)); > } > > num_true_changes++; > Index: function.c > =================================================================== > --- function.c (revision 193376) > +++ function.c (working copy) > @@ -6249,8 +6249,10 @@ thread_prologue_and_epilogue_insns (void) > break; > if (e) > { > - copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)), > - NULL_RTX, e->src); > + /* Make sure we insert after any barriers. */ > + rtx end = get_last_bb_insn (e->src); > + copy_bb = create_basic_block (NEXT_INSN (end), > + NULL_RTX, e->src); > BB_COPY_PARTITION (copy_bb, e->src); > } > else > @@ -6475,7 +6477,7 @@ thread_prologue_and_epilogue_insns (void) > if (cur_bb->index >= NUM_FIXED_BLOCKS > && cur_bb->next_bb->index >= NUM_FIXED_BLOCKS) > cur_bb->aux = cur_bb->next_bb; > - cfg_layout_finalize (); > + cfg_layout_finalize (false); > } > > epilogue_done: > @@ -6517,7 +6519,7 @@ epilogue_done: > basic_block simple_return_block_cold = NULL; > edge pending_edge_hot = NULL; > edge pending_edge_cold = NULL; > - basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb; > + basic_block exit_pred; > int i; > > gcc_assert (entry_edge != orig_entry_edge); > @@ -6545,6 +6547,12 @@ epilogue_done: > else > pending_edge_cold = e; > } > + > + /* Save a pointer to the exit's predecessor BB for use in > + inserting new BBs at the end of the function. Do this > + after the call to split_block above which may split > + the original exit pred. */ > + exit_pred = EXIT_BLOCK_PTR->prev_bb; > > FOR_EACH_VEC_ELT (edge, unconverted_simple_returns, i, e) > { > Index: function.h > =================================================================== > --- function.h (revision 193376) > +++ function.h (working copy) > @@ -459,6 +459,11 @@ struct GTY(()) rtl_data { > sched2) and is useful only if the port defines LEAF_REGISTERS. */ > bool uses_only_leaf_regs; > > + /* Nonzero if the function being compiled has undergone hot/cold > partitioning > + (under flag_reorder_blocks_and_partition) and has at least one cold > + block. */ > + bool has_bb_partition; > + > /* Like regs_ever_live, but 1 if a reg is set or clobbered from an > asm. Unlike regs_ever_live, elements of this array corresponding > to eliminable regs (like the frame pointer) are set if an asm > Index: hw-doloop.c > =================================================================== > --- hw-doloop.c (revision 193376) > +++ hw-doloop.c (working copy) > @@ -547,7 +547,7 @@ reorder_loops (hwloop_info loops) > else > bb->aux = NULL; > } > - cfg_layout_finalize (); > + cfg_layout_finalize (false); > clear_aux_for_blocks (); > df_analyze (); > } > Index: cfgcleanup.c > =================================================================== > --- cfgcleanup.c (revision 193376) > +++ cfgcleanup.c (working copy) > @@ -1824,7 +1824,7 @@ try_crossjump_to_edge (int mode, edge e1, edge e2, > partition boundaries). See the comments at the top of > bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ > > - if (flag_reorder_blocks_and_partition && reload_completed) > + if (crtl->has_bb_partition && reload_completed) > return false; > > /* Search backward through forwarder blocks. We don't need to worry > @@ -2767,10 +2767,21 @@ try_optimize_cfg (int mode) > df_analyze (); > } > > + if (changed) > + { > + /* Edge forwarding in particular can cause hot blocks > previously > + reached by both hot and cold blocks to become dominated only > + by cold blocks. This will cause the verification below to > fail, > + and lead to now cold code in the hot section. This is not > easy > + to detect and fix during edge forwarding, and in some cases > + is only visible after newly unreachable blocks are deleted, > + which will be done in fixup_partitions. */ > + fixup_partitions (); > + > #ifdef ENABLE_CHECKING > - if (changed) > - verify_flow_info (); > + verify_flow_info (); > #endif > + } > > changed_overall |= changed; > first_pass = false; > Index: bb-reorder.c > =================================================================== > --- bb-reorder.c (revision 193376) > +++ bb-reorder.c (working copy) > @@ -1054,7 +1054,7 @@ connect_traces (int n_traces, struct trace *traces > current_partition = BB_PARTITION (traces[0].first); > two_passes = false; > > - if (flag_reorder_blocks_and_partition) > + if (crtl->has_bb_partition) > for (i = 0; i < n_traces && !two_passes; i++) > if (BB_PARTITION (traces[0].first) > != BB_PARTITION (traces[i].first)) > @@ -1263,7 +1263,7 @@ connect_traces (int n_traces, struct trace *traces > } > } > > - if (flag_reorder_blocks_and_partition) > + if (crtl->has_bb_partition) > try_copy = false; > > /* Copy tiny blocks always; copy larger blocks only when the > @@ -1381,13 +1381,14 @@ get_uncond_jump_length (void) > return length; > } > > -/* Emit a barrier into the footer of BB. */ > +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. */ > > -static void > +void > emit_barrier_after_bb (basic_block bb) > { > rtx barrier = emit_barrier_after (BB_END (bb)); > - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); > + if (current_ir_type () == IR_RTL_CFGLAYOUT) > + BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); > } > > /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions. > @@ -1463,18 +1464,109 @@ find_rarely_executed_basic_blocks_and_crossing_edg > { > VEC(edge, heap) *crossing_edges = NULL; > basic_block bb; > - edge e; > - edge_iterator ei; > + edge e, e2; > + edge_iterator ei, ei2; > + unsigned int cold_bb_count = 0; > + VEC (basic_block, heap) *bbs_in_hot_partition = NULL; > + VEC (basic_block, heap) *bbs_newly_hot = NULL; > > /* Mark which partition (hot/cold) each basic block belongs in. */ > FOR_EACH_BB (bb) > { > if (probably_never_executed_bb_p (cfun, bb)) > - BB_SET_PARTITION (bb, BB_COLD_PARTITION); > + { > + BB_SET_PARTITION (bb, BB_COLD_PARTITION); > + cold_bb_count++; > + } > else > - BB_SET_PARTITION (bb, BB_HOT_PARTITION); > + { > + BB_SET_PARTITION (bb, BB_HOT_PARTITION); > + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, bb); > + } > } > > + /* Ensure that no cold bbs dominate hot bbs. This could happen as a result > of > + several different possibilities. One is that there are edge weight > insanities > + due to optimization phases that do not properly update basic block > profile > + counts. The second is that the entry of the function may not be hot, > because > + it is entered fewer times than the number of profile training runs, but > there > + is a loop inside the function that causes blocks within the function to > be > + above the threshold for hotness. */ > + if (cold_bb_count) > + { > + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); > + > + if (dom_calculated_here) > + calculate_dominance_info (CDI_DOMINATORS); > + > + /* Keep examining hot bbs until we have either checked them all, or > + re-marked all cold bbs hot. */ > + while (! VEC_empty (basic_block, bbs_in_hot_partition) > + && cold_bb_count) > + { > + basic_block dom_bb; > + > + bb = VEC_pop (basic_block, bbs_in_hot_partition); > + dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb); > + > + /* If bb's immediate dominator is also hot then it is ok. */ > + if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION) > + continue; > + > + /* We have a hot bb with an immediate dominator that is cold. > + The dominator needs to be re-marked to hot. */ > + BB_SET_PARTITION (dom_bb, BB_HOT_PARTITION); > + cold_bb_count--; > + > + /* Now we need to examine newly-hot dom_bb to see if it is also > + dominated by a cold bb. */ > + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, dom_bb); > + > + /* We should also adjust any cold blocks that the newly-hot bb > + feeds and see if it makes sense to re-mark those as hot as > + well. */ > + VEC_safe_push (basic_block, heap, bbs_newly_hot, dom_bb); > + while (! VEC_empty (basic_block, bbs_newly_hot)) > + { > + basic_block new_hot_bb = VEC_pop (basic_block, bbs_newly_hot); > + /* Examine all successors of this newly-hot bb to see if they > + are cold and should be re-marked as hot. */ > + FOR_EACH_EDGE (e, ei, new_hot_bb->succs) > + { > + bool any_cold_preds = false; > + basic_block succ = e->dest; > + if (BB_PARTITION (succ) != BB_COLD_PARTITION) > + continue; > + /* Does this block have any cold predecessors now? */ > + FOR_EACH_EDGE (e2, ei2, succ->preds) > + { > + if (BB_PARTITION (e2->src) == BB_COLD_PARTITION) > + { > + any_cold_preds = true; > + break; > + } > + } > + if (any_cold_preds) > + continue; > + > + /* Here we have a successor of newly-hot bb that is cold > + but no longer has any cold precessessors. Since the > original > + assignment of our newly-hot bb was incorrect, this > successor's > + assignment as cold is also suspect. Go ahead and > re-mark it > + as hot now too. Better heuristics may be in order here. > */ > + BB_SET_PARTITION (succ, BB_HOT_PARTITION); > + cold_bb_count--; > + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, > succ); > + /* Examine this successor as a newly-hot bb. */ > + VEC_safe_push (basic_block, heap, bbs_newly_hot, succ); > + } > + } > + } > + > + if (dom_calculated_here) > + free_dominance_info (CDI_DOMINATORS); > + } > + > /* The format of .gcc_except_table does not allow landing pads to > be in a different partition as the throw. Fix this by either > moving or duplicating the landing pads. */ > @@ -1766,10 +1858,10 @@ fix_up_fall_thru_edges (void) > new_bb->aux = cur_bb->aux; > cur_bb->aux = new_bb; > > - /* Make sure new fall-through bb is in same > - partition as bb it's falling through from. */ > + /* This is done by force_nonfallthru_and_redirect. */ > + gcc_assert (BB_PARTITION (new_bb) > + == BB_PARTITION (cur_bb)); > > - BB_COPY_PARTITION (new_bb, cur_bb); > single_succ_edge (new_bb)->flags |= EDGE_CROSSING; > } > else > @@ -2067,7 +2159,10 @@ add_reg_crossing_jump_notes (void) > FOR_EACH_BB (bb) > FOR_EACH_EDGE (e, ei, bb->succs) > if ((e->flags & EDGE_CROSSING) > - && JUMP_P (BB_END (e->src))) > + && JUMP_P (BB_END (e->src)) > + /* Some notes were added during fix_up_fall_thru_edges, via > + force_nonfallthru_and_redirect. */ > + && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX)) > add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > } > > @@ -2160,7 +2255,7 @@ reorder_basic_blocks (void) > dump_flow_info (dump_file, dump_flags); > } > > - if (flag_reorder_blocks_and_partition) > + if (crtl->has_bb_partition) > verify_hot_cold_block_grouping (); > } > > @@ -2172,14 +2267,14 @@ reorder_basic_blocks (void) > encountering this note will make the compiler switch between the > hot and cold text sections. */ > > -static void > +void > insert_section_boundary_note (void) > { > basic_block bb; > rtx new_note; > int first_partition = 0; > > - if (!flag_reorder_blocks_and_partition) > + if (!crtl->has_bb_partition) > return; > > FOR_EACH_BB (bb) > @@ -2222,10 +2317,8 @@ rest_of_handle_reorder_blocks (void) > FOR_EACH_BB (bb) > if (bb->next_bb != EXIT_BLOCK_PTR) > bb->aux = bb->next_bb; > - cfg_layout_finalize (); > + cfg_layout_finalize (true); > > - /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */ > - insert_section_boundary_note (); > return 0; > } > > @@ -2366,7 +2459,7 @@ duplicate_computed_gotos (void) > } > > done: > - cfg_layout_finalize (); > + cfg_layout_finalize (false); > > BITMAP_FREE (candidates); > return 0; > @@ -2511,6 +2604,8 @@ partition_hot_cold_basic_blocks (void) > if (crossing_edges == NULL) > return 0; > > + crtl->has_bb_partition = true; > + > /* Make sure the source of any crossing edge ends in a jump and the > destination of any crossing edge has a label. */ > add_labels_and_missing_jumps (crossing_edges); > Index: bb-reorder.h > =================================================================== > --- bb-reorder.h (revision 193376) > +++ bb-reorder.h (working copy) > @@ -36,4 +36,8 @@ extern struct target_bb_reorder *this_target_bb_re > > extern int get_uncond_jump_length (void); > > +extern void insert_section_boundary_note (void); > + > +extern void emit_barrier_after_bb (basic_block bb); > + > #endif > Index: basic-block.h > =================================================================== > --- basic-block.h (revision 193376) > +++ basic-block.h (working copy) > @@ -806,6 +806,7 @@ extern basic_block force_nonfallthru_and_redirect > extern bool contains_no_active_insn_p (const_basic_block); > extern bool forwarder_block_p (const_basic_block); > extern bool can_fallthru (basic_block, basic_block); > +extern void fixup_partitions (void); > > /* In cfgbuild.c. */ > extern void find_many_sub_basic_blocks (sbitmap); > Index: cfgrtl.c > =================================================================== > --- cfgrtl.c (revision 193376) > +++ cfgrtl.c (working copy) > @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree.h" > #include "hard-reg-set.h" > #include "basic-block.h" > +#include "bb-reorder.h" > #include "regs.h" > #include "flags.h" > #include "function.h" > @@ -67,11 +68,12 @@ along with GCC; see the file COPYING3. If not see > Only applicable if the CFG is in cfglayout mode. */ > static GTY(()) rtx cfg_layout_function_footer; > static GTY(()) rtx cfg_layout_function_header; > +static bool had_sec_boundary_notes; > > static rtx skip_insns_after_block (basic_block); > static void record_effective_endpoints (void); > static rtx label_for_bb (basic_block); > -static void fixup_reorder_chain (void); > +static void fixup_reorder_chain (bool finalize_reorder_blocks); > > void verify_insn_chain (void); > static void fixup_fallthru_exit_predecessor (void); > @@ -976,8 +978,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc > partition boundaries). See the comments at the top of > bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ > > - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) > - || BB_PARTITION (src) != BB_PARTITION (target)) > + if (BB_PARTITION (src) != BB_PARTITION (target)) > return NULL; > > /* We can replace or remove a complex jump only when we have exactly > @@ -1286,6 +1287,71 @@ redirect_branch_edge (edge e, basic_block target) > return e; > } > > +/* Called when edge E has been redirected to a new destination, > + in order to update the region crossing flag on the edge and > + jump. */ > + > +static void > +fixup_partition_crossing (edge e, basic_block target) > +{ > + rtx note; > + > + gcc_assert (e->dest == target); > + > + if (e->src == ENTRY_BLOCK_PTR || target == EXIT_BLOCK_PTR) > + return; > + /* If we redirected an existing edge, it may already be marked > + crossing, even though the new src is missing a reg crossing note. > + But make sure reg crossing note doesn't already exist before > + inserting. */ > + if (BB_PARTITION (e->src) != BB_PARTITION (target)) > + { > + e->flags |= EDGE_CROSSING; > + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > + if (JUMP_P (BB_END (e->src)) > + && !note) > + add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > + } > + else if (BB_PARTITION (e->src) == BB_PARTITION (target)) > + { > + e->flags &= ~EDGE_CROSSING; > + /* Remove the region crossing note from jump at end of > + e->src if it exists. */ > + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > + if (note) > + remove_note (BB_END (e->src), note); > + } > +} > + > +/* Called when block BB has been reassigned to a different partition, > + to ensure that the region crossing attributes are updated. */ > + > +static void > +fixup_bb_partition (basic_block bb) > +{ > + edge e; > + edge_iterator ei; > + > + /* Now need to make bb's pred edges non-region crossing. */ > + FOR_EACH_EDGE (e, ei, bb->preds) > + { > + fixup_partition_crossing (e, e->dest); > + } > + > + /* Possibly need to make bb's successor edges region crossing, > + or remove stale region crossing. */ > + FOR_EACH_EDGE (e, ei, bb->succs) > + { > + if ((e->flags & EDGE_FALLTHRU) > + && BB_PARTITION (bb) != BB_PARTITION (e->dest) > + && e->dest != EXIT_BLOCK_PTR) > + /* force_nonfallthru_and_redirect calls fixup_partition_crossing. */ > + force_nonfallthru (e); > + else > + fixup_partition_crossing (e, e->dest); > + } > +} > + > /* Attempt to change code to redirect edge E to TARGET. Don't do that on > expense of adding new instructions or reordering basic blocks. > > @@ -1302,16 +1368,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block > { > edge ret; > basic_block src = e->src; > + basic_block dest = e->dest; > > if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) > return NULL; > > - if (e->dest == target) > + if (dest == target) > return e; > > if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL) > { > df_set_bb_dirty (src); > + fixup_partition_crossing (ret, target); > return ret; > } > > @@ -1320,6 +1388,7 @@ rtl_redirect_edge_and_branch (edge e, basic_block > return NULL; > > df_set_bb_dirty (src); > + fixup_partition_crossing (ret, target); > return ret; > } > > @@ -1454,18 +1523,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc > /* Make sure new block ends up in correct hot/cold section. */ > > BB_COPY_PARTITION (jump_block, e->src); > - if (flag_reorder_blocks_and_partition > - && targetm_common.have_named_sections > - && JUMP_P (BB_END (jump_block)) > - && !any_condjump_p (BB_END (jump_block)) > - && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING)) > - add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX); > > /* Wire edge in. */ > new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU); > new_edge->probability = probability; > new_edge->count = count; > > + /* If e->src was previously region crossing, it no longer is > + and the reg crossing note should be removed. */ > + fixup_partition_crossing (new_edge, jump_block); > + > /* Redirect old edge. */ > redirect_edge_pred (e, jump_block); > e->probability = REG_BR_PROB_BASE; > @@ -1521,13 +1588,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc > LABEL_NUSES (label)++; > } > > - emit_barrier_after (BB_END (jump_block)); > + /* We might be in cfg layout mode, and if so, the following routine will > + insert the barrier correctly. */ > + emit_barrier_after_bb (jump_block); > redirect_edge_succ_nodup (e, target); > > if (abnormal_edge_flags) > make_edge (src, target, abnormal_edge_flags); > > df_mark_solutions_dirty (); > + fixup_partition_crossing (e, target); > return new_bb; > } > > @@ -1626,7 +1696,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU > static basic_block > rtl_split_edge (edge edge_in) > { > - basic_block bb; > + basic_block bb, new_bb; > rtx before; > > /* Abnormal edges cannot be split. */ > @@ -1659,12 +1729,26 @@ rtl_split_edge (edge edge_in) > else > { > bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); > - /* ??? Why not edge_in->dest->prev_bb here? */ > - BB_COPY_PARTITION (bb, edge_in->dest); > + if (edge_in->src == ENTRY_BLOCK_PTR) > + BB_COPY_PARTITION (bb, edge_in->dest); > + else > + /* Put the split bb into the src partition, to avoid creating > + a situation where a cold bb dominates a hot bb, in the case > + where src is cold and dest is hot. The src will dominate > + the new bb (whereas it might not have dominated dest). */ > + BB_COPY_PARTITION (bb, edge_in->src); > } > > make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU); > > + /* Can't allow a region crossing edge to be fallthrough. */ > + if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest) > + && edge_in->dest != EXIT_BLOCK_PTR) > + { > + new_bb = force_nonfallthru (single_succ_edge (bb)); > + gcc_assert (!new_bb); > + } > + > /* For non-fallthru edges, we must adjust the predecessor's > jump instruction to target our new block. */ > if ((edge_in->flags & EDGE_FALLTHRU) == 0) > @@ -1777,17 +1861,13 @@ commit_one_edge_insertion (edge e) > else > { > bb = split_edge (e); > - after = BB_END (bb); > > - if (flag_reorder_blocks_and_partition > - && targetm_common.have_named_sections > - && e->src != ENTRY_BLOCK_PTR > - && BB_PARTITION (e->src) == BB_COLD_PARTITION > - && !(e->flags & EDGE_CROSSING) > - && JUMP_P (after) > - && !any_condjump_p (after) > - && (single_succ_edge (bb)->flags & EDGE_CROSSING)) > - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX); > + /* If e crossed a partition boundary, we needed to make bb end in > + a region-crossing jump, even though it was originally fallthru. */ > + if (JUMP_P (BB_END (bb))) > + before = BB_END (bb); > + else > + after = BB_END (bb); > } > > /* Now that we've found the spot, do the insertion. */ > @@ -1827,6 +1907,14 @@ commit_edge_insertions (void) > { > basic_block bb; > > + /* Optimization passes that invoke this routine can cause hot blocks > + previously reached by both hot and cold blocks to become dominated only > + by cold blocks. This will cause the verification below to fail, > + and lead to now cold code in the hot section. In some cases this > + may only be visible after newly unreachable blocks are deleted, > + which will be done by fixup_partitions. */ > + fixup_partitions (); > + > #ifdef ENABLE_CHECKING > verify_flow_info (); > #endif > @@ -2028,7 +2116,75 @@ get_last_bb_insn (basic_block bb) > > return end; > } > - > + > +/* Perform cleanup on the hot/cold bb partitioning after optimization > + passes that modify the cfg. */ > + > +void > +fixup_partitions (void) > +{ > + basic_block bb; > + > + if (!crtl->has_bb_partition) > + return; > + > + /* Delete any blocks that became unreachable and weren't > + already cleaned up, for example during edge forwarding > + and convert_jumps_to_returns. This will expose more > + opportunities for fixing the partition boundaries here. > + Also, the calculation of the dominance graph during verification > + will assert if there are unreachable nodes. */ > + delete_unreachable_blocks (); > + > + /* If there are partitions, do a sanity check on them: A basic block in > + a cold partition cannot dominate a basic block in a hot partition. > + Fixup any that now violate this requirement, as a result of edge > + forwarding and unreachable block deletion. */ > + VEC (basic_block, heap) *bbs_in_cold_partition = NULL; > + VEC (basic_block, heap) *bbs_to_fix = NULL; > + FOR_EACH_BB (bb) > + if ((BB_PARTITION (bb) == BB_COLD_PARTITION)) > + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb); > + if (! VEC_empty (basic_block, bbs_in_cold_partition)) > + { > + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); > + basic_block son; > + > + if (dom_calculated_here) > + calculate_dominance_info (CDI_DOMINATORS); > + > + while (! VEC_empty (basic_block, bbs_in_cold_partition)) > + { > + bb = VEC_pop (basic_block, bbs_in_cold_partition); > + /* If bb is not yet cold (because it was added below as > + a block dominated by a cold bb) then mark it cold here. */ > + if ((BB_PARTITION (bb) != BB_COLD_PARTITION)) > + { > + BB_SET_PARTITION (bb, BB_COLD_PARTITION); > + VEC_safe_push (basic_block, heap, bbs_to_fix, bb); > + } > + /* Any blocks dominated by a block in the cold section > + must also be cold. */ > + for (son = first_dom_son (CDI_DOMINATORS, bb); > + son; > + son = next_dom_son (CDI_DOMINATORS, son)) > + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son); > + } > + > + if (dom_calculated_here) > + free_dominance_info (CDI_DOMINATORS); > + } > + > + /* Do the partition fixup after all necessary blocks have been converted to > + cold, so that we only update the region crossings the minimum number of > + places, which can require forcing edges to be non fallthru. */ > + while (! VEC_empty (basic_block, bbs_to_fix)) > + { > + bb = VEC_pop (basic_block, bbs_to_fix); > + fixup_bb_partition (bb); > + } > +} > + > /* Verify the CFG and RTL consistency common for both underlying RTL and > cfglayout RTL. > > @@ -2052,6 +2208,7 @@ rtl_verify_flow_info_1 (void) > rtx x; > int err = 0; > basic_block bb; > + bool have_partitions = false; > > /* Check the general integrity of the basic blocks. */ > FOR_EACH_BB_REVERSE (bb) > @@ -2169,6 +2326,8 @@ rtl_verify_flow_info_1 (void) > > if (e->flags & EDGE_ABNORMAL) > n_abnormal++; > + > + have_partitions |= is_crossing; > } > > if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX)) > @@ -2293,6 +2452,40 @@ rtl_verify_flow_info_1 (void) > } > } > > + /* If there are partitions, do a sanity check on them: A basic block in > + a cold partition cannot dominate a basic block in a hot partition. */ > + VEC (basic_block, heap) *bbs_in_cold_partition = NULL; > + if (have_partitions && !err) > + FOR_EACH_BB (bb) > + if ((BB_PARTITION (bb) == BB_COLD_PARTITION)) > + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb); > + if (! VEC_empty (basic_block, bbs_in_cold_partition)) > + { > + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); > + basic_block son; > + > + if (dom_calculated_here) > + calculate_dominance_info (CDI_DOMINATORS); > + > + while (! VEC_empty (basic_block, bbs_in_cold_partition)) > + { > + bb = VEC_pop (basic_block, bbs_in_cold_partition); > + if ((BB_PARTITION (bb) != BB_COLD_PARTITION)) > + { > + error ("non-cold basic block %d dominated " > + "by a block in the cold partition", bb->index); > + err = 1; > + } > + for (son = first_dom_son (CDI_DOMINATORS, bb); > + son; > + son = next_dom_son (CDI_DOMINATORS, son)) > + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son); > + } > + > + if (dom_calculated_here) > + free_dominance_info (CDI_DOMINATORS); > + } > + > /* Clean up. */ > return err; > } > @@ -2965,14 +3158,41 @@ record_effective_endpoints (void) > else > cfg_layout_function_header = NULL_RTX; > > + had_sec_boundary_notes = false; > + > next_insn = get_insns (); > FOR_EACH_BB (bb) > { > rtx end; > > if (PREV_INSN (BB_HEAD (bb)) && next_insn != BB_HEAD (bb)) > - BB_HEADER (bb) = unlink_insn_chain (next_insn, > - PREV_INSN (BB_HEAD (bb))); > + { > + /* Rather than try to keep section boundary notes incrementally > + up-to-date through cfg layout optimizations, simply remove them > + and flag that they should be re-inserted when exiting > + cfg layout mode. */ > + rtx check_insn = next_insn; > + while (check_insn) > + { > + if (NOTE_P (check_insn) > + && NOTE_KIND (check_insn) == > NOTE_INSN_SWITCH_TEXT_SECTIONS) > + { > + had_sec_boundary_notes |= true; > + /* Remove note from chain. Grab new next_insn first. */ > + if (next_insn == check_insn) > + next_insn = NEXT_INSN (check_insn); > + /* Delete note. */ > + delete_insn (check_insn); > + /* There will only be one. */ > + break; > + } > + check_insn = NEXT_INSN (check_insn); > + } > + /* If we still have header instructions left after above loop. */ > + if (next_insn != BB_HEAD (bb)) > + BB_HEADER (bb) = unlink_insn_chain (next_insn, > + PREV_INSN (BB_HEAD (bb))); > + } > end = skip_insns_after_block (bb); > if (NEXT_INSN (BB_END (bb)) && BB_END (bb) != end) > BB_FOOTER (bb) = unlink_insn_chain (NEXT_INSN (BB_END (bb)), end); > @@ -3000,7 +3220,7 @@ outof_cfg_layout_mode (void) > if (bb->next_bb != EXIT_BLOCK_PTR) > bb->aux = bb->next_bb; > > - cfg_layout_finalize (); > + cfg_layout_finalize (false); > > return 0; > } > @@ -3120,10 +3340,13 @@ relink_block_chain (bool stay_in_cfglayout_mode) > } > > > -/* Given a reorder chain, rearrange the code to match. */ > +/* Given a reorder chain, rearrange the code to match. If > + this is called when we will FINALIZE_REORDER_BLOCKS, or when > + section boundary notes were removed on entry to cfg layout > + mode, insert section boundary notes here. */ > > static void > -fixup_reorder_chain (void) > +fixup_reorder_chain (bool finalize_reorder_blocks) > { > basic_block bb; > rtx insn = NULL; > @@ -3150,7 +3373,7 @@ static void > PREV_INSN (BB_HEADER (bb)) = insn; > insn = BB_HEADER (bb); > while (NEXT_INSN (insn)) > - insn = NEXT_INSN (insn); > + insn = NEXT_INSN (insn); > } > if (insn) > NEXT_INSN (insn) = BB_HEAD (bb); > @@ -3175,6 +3398,11 @@ static void > insn = NEXT_INSN (insn); > > set_last_insn (insn); > + > + /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */ > + if (had_sec_boundary_notes || finalize_reorder_blocks) > + insert_section_boundary_note (); > + > #ifdef ENABLE_CHECKING > verify_insn_chain (); > #endif > @@ -3187,7 +3415,7 @@ static void > edge e_fall, e_taken, e; > rtx bb_end_insn; > rtx ret_label = NULL_RTX; > - basic_block nb, src_bb; > + basic_block nb; > edge_iterator ei; > > if (EDGE_COUNT (bb->succs) == 0) > @@ -3322,7 +3550,6 @@ static void > /* We got here if we need to add a new jump insn. > Note force_nonfallthru can delete E_FALL and thus we have to > save E_FALL->src prior to the call to force_nonfallthru. */ > - src_bb = e_fall->src; > nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label); > if (nb) > { > @@ -3330,17 +3557,6 @@ static void > bb->aux = nb; > /* Don't process this new block. */ > bb = nb; > - > - /* Make sure new bb is tagged for correct section (same as > - fall-thru source, since you cannot fall-thru across > - section boundaries). */ > - BB_COPY_PARTITION (src_bb, single_pred (bb)); > - if (flag_reorder_blocks_and_partition > - && targetm_common.have_named_sections > - && JUMP_P (BB_END (bb)) > - && !any_condjump_p (BB_END (bb)) > - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING)) > - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX); > } > } > > @@ -3644,10 +3860,11 @@ duplicate_insn_chain (rtx from, rtx to) > case NOTE_INSN_FUNCTION_BEG: > /* There is always just single entry to function. */ > case NOTE_INSN_BASIC_BLOCK: > + /* We should only switch text sections once. */ > + case NOTE_INSN_SWITCH_TEXT_SECTIONS: > break; > > case NOTE_INSN_EPILOGUE_BEG: > - case NOTE_INSN_SWITCH_TEXT_SECTIONS: > emit_note_copy (insn); > break; > > @@ -3759,10 +3976,13 @@ break_superblocks (void) > } > > /* Finalize the changes: reorder insn list according to the sequence > specified > - by aux pointers, enter compensation code, rebuild scope forest. */ > + by aux pointers, enter compensation code, rebuild scope forest. If > + this is called when we will FINALIZE_REORDER_BLOCKS, indicate that > + to fixup_reorder_chain so that it can insert the proper switch text > + section notes. */ > > void > -cfg_layout_finalize (void) > +cfg_layout_finalize (bool finalize_reorder_blocks) > { > #ifdef ENABLE_CHECKING > verify_flow_info (); > @@ -3775,7 +3995,7 @@ void > #endif > ) > fixup_fallthru_exit_predecessor (); > - fixup_reorder_chain (); > + fixup_reorder_chain (finalize_reorder_blocks); > > rebuild_jump_labels (get_insns ()); > delete_dead_jumptables (); > @@ -4454,8 +4674,7 @@ rtl_can_remove_branch_p (const_edge e) > if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) > return false; > > - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) > - || BB_PARTITION (src) != BB_PARTITION (target)) > + if (BB_PARTITION (src) != BB_PARTITION (target)) > return false; > > if (!onlyjump_p (insn) > > -- > This patch is available for review at http://codereview.appspot.com/6823047
-- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413