Hi, I have tested your patch on Spec2000 on ARM, and I can still see several failures caused by: "error: fallthru edge crosses section boundary", including the case described in PR55121.
On 26 November 2012 16:55, Teresa Johnson <tejohn...@google.com> wrote: > Ping. > Teresa > > On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson <tejohn...@google.com> wrote: >> Revised patch that fixes failures encountered when enabling >> -freorder-blocks-and-partition, including the failure reported in PR 53743. >> >> This includes new verification code to ensure no cold blocks dominate hot >> blocks contributed by Steven Bosscher. >> >> I attempted to make the handling of partition updates through the >> optimization >> passes much more consistent, removing a number of partial fixes in the code >> stream in the process. The code to fixup partitions (including the >> BB_PARTITION >> assignement, region crossing jump notes, and switch text section notes) is >> now handled in a few centralized locations. For example, inside >> rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that >> callers >> don't need to attempt the fixup themselves. >> >> For optimization passes that make adjustments to the cfg while in cfg layout >> mode that are not easy to fix up incrementally, the new routine >> fixup_partitions handles the cleanup globally. This does require calculation >> of the dominance relation, however, as far as I can tell the routines which >> now invoke this global fixup (try_optimize_cfg and commit_edge_insertions) >> are invoked typically once (or a small number of times in the case of >> try_optimize_cfg) per optimization pass. Additionally, I compared the >> -ftime-report output for some large fdo compilations and saw only minimal >> increases in the dominance computation times, which were only a tiny percent >> of the overall compile time. >> >> Additionally, I added a flag to the rtl_data structure to indicate whether >> any partitioning was actually performed, so that optimizations which were >> conservatively disabled whenever the flag_reorder_blocks_and_partition >> is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less >> conservative for functions where no partitions were formed (e.g. they are >> completely hot). >> >> Bootstrapped and tested on x86_64-unknown-linux-gnu. Also tested with >> SPEC2006 int >> benchmarks and internal google benchmarks using profile feedback and >> -freorder-blocks-and-partition to get more coverage. Ok for trunk? >> >> Thanks, >> Teresa >> >> 2012-11-14 Teresa Johnson <tejohn...@google.com> >> Steven Bosscher <ste...@gcc.gnu.org> >> >> * cfghooks.h (cfg_layout_finalize): New parameter. >> * modulo-sched.c (rest_of_handle_sms): New cfg_layout_finalize >> parameter. >> * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert >> as this is now done by redirect_edge_and_branch_force. >> * function.c (thread_prologue_and_epilogue_insns): Insert new bb >> after >> barriers, new cfg_layout_finalize parameter, and don't store exit >> predecessor BB until after it is potentially split. >> * function.h (struct rtl_data): New flag has_bb_partition. >> * hw-doloop.c (reorder_loops): New cfg_layout_finalize parameter. >> * cfgcleanup.c (try_crossjump_to_edge): Only skip optimization if >> any blocks in function actually partitioned. >> (try_optimize_cfg): If cfg changed, invoke fixup_partitions to clean >> up partitioning. >> * bb-reorder.c (connect_traces): Only look for partitions and skip >> block copying if any blocks in function actually partitioned. >> (emit_barrier_after_bb): Handle insertion in non-cfglayout mode. >> (find_rarely_executed_basic_blocks_and_crossing_edges): Ensure >> that no cold blocks dominate a hot block. >> (fix_up_fall_thru_edges): Replace BB_COPY_PARTITION with assert >> as this is now done by force_nonfallthru_and_redirect. >> (add_reg_crossing_jump_notes): Handle the fact that some jumps may >> already be marked with region crossing note. >> (reorder_basic_blocks): Only need to verify partitions if any >> blocks in function actually partitioned. >> (insert_section_boundary_note): Only need to insert note if any >> blocks in function actually partitioned. >> (rest_of_handle_reorder_blocks): New cfg_layout_finalize >> parameter, and remove call to insert_section_boundary_note as this >> is now called via cfg_layout_finalize/fixup_reorder_chain. >> (duplicate_computed_gotos): New cfg_layout_finalize >> parameter. >> (partition_hot_cold_basic_blocks): Set flag indicating function >> has bb partitions. >> * bb-reorder.h: Declare insert_section_boundary_note and >> emit_barrier_after_bb, which are no longer static. >> * basic-block.h: Declare new function fixup_partitions. >> * cfgrtl.c (try_redirect_by_replacing_jump): Remove unnecessary >> check for region crossing note. >> (fixup_partition_crossing): New function. >> (fixup_bb_partition): Ditto. >> (rtl_redirect_edge_and_branch): Fixup partition boundaries. >> (force_nonfallthru_and_redirect): Fixup partition boundaries, >> remove old code that tried to do this. Emit barrier correctly >> when we are in cfglayout mode. >> (rtl_split_edge): Correctly fixup partition boundaries. >> (commit_one_edge_insertion): Remove old code that tried to >> fixup region crossing edge since this is now handled in >> split_block, and set up insertion point correctly since >> block may now end in a jump. >> (commit_edge_insertions): Invoke fixup_partitions to sanitize >> partition >> boundaries after optimizations that modify cfg and before trying to >> verify the flow info. >> (fixup_partitions): New function. >> (rtl_verify_flow_info_1): Add verification that no cold bbs dominate >> hot bbs. >> (record_effective_endpoints): Remove region-crossing notes and set >> flag >> indicating that they need to be reinserted on exit from cfglayout >> mode. >> (outof_cfg_layout_mode): New cfg_layout_finalize parameter. >> (fixup_reorder_chain): Call insert_section_boundary_note if >> necessary. >> Remove old code that attempted to fixup region crossing note as >> this is now handled in force_nonfallthru_and_redirect. >> (duplicate_insn_chain): Don't duplicate switch section notes. >> (cfg_layout_finalize): Pass new parameter to fixup_reorder_chain. >> (rtl_can_remove_branch_p): Remove unnecessary check for region >> crossing >> note. >> >> Index: cfghooks.h >> =================================================================== >> --- cfghooks.h (revision 193376) >> +++ cfghooks.h (working copy) >> @@ -204,7 +204,7 @@ extern void copy_bbs (basic_block *, unsigned, bas >> void account_profile_record (struct profile_record *, int); >> >> extern void cfg_layout_initialize (unsigned int); >> -extern void cfg_layout_finalize (void); >> +extern void cfg_layout_finalize (bool); >> >> /* Hooks containers. */ >> extern struct cfg_hooks gimple_cfg_hooks; >> @@ -218,4 +218,3 @@ extern void cfg_layout_rtl_register_cfg_hooks (voi >> extern void gimple_register_cfg_hooks (void); >> extern struct cfg_hooks get_cfg_hooks (void); >> extern void set_cfg_hooks (struct cfg_hooks); >> - >> Index: modulo-sched.c >> =================================================================== >> --- modulo-sched.c (revision 193376) >> +++ modulo-sched.c (working copy) >> @@ -3354,7 +3354,7 @@ rest_of_handle_sms (void) >> if (bb->next_bb != EXIT_BLOCK_PTR) >> bb->aux = bb->next_bb; >> free_dominance_info (CDI_DOMINATORS); >> - cfg_layout_finalize (); >> + cfg_layout_finalize (false); >> #endif /* INSN_SCHEDULING */ >> return 0; >> } >> Index: ifcvt.c >> =================================================================== >> --- ifcvt.c (revision 193376) >> +++ ifcvt.c (working copy) >> @@ -3900,10 +3900,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg >> if (new_bb) >> { >> df_bb_replace (then_bb_index, new_bb); >> - /* Since the fallthru edge was redirected from test_bb to new_bb, >> - we need to ensure that new_bb is in the same partition as >> - test bb (you can not fall through across section boundaries). */ >> - BB_COPY_PARTITION (new_bb, test_bb); >> + /* This should have been done above via force_nonfallthru_and_redirect >> + (possibly called from redirect_edge_and_branch_force). */ >> + gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb)); >> } >> >> num_true_changes++; >> Index: function.c >> =================================================================== >> --- function.c (revision 193376) >> +++ function.c (working copy) >> @@ -6249,8 +6249,10 @@ thread_prologue_and_epilogue_insns (void) >> break; >> if (e) >> { >> - copy_bb = create_basic_block (NEXT_INSN (BB_END >> (e->src)), >> - NULL_RTX, e->src); >> + /* Make sure we insert after any barriers. */ >> + rtx end = get_last_bb_insn (e->src); >> + copy_bb = create_basic_block (NEXT_INSN (end), >> + NULL_RTX, e->src); >> BB_COPY_PARTITION (copy_bb, e->src); >> } >> else >> @@ -6475,7 +6477,7 @@ thread_prologue_and_epilogue_insns (void) >> if (cur_bb->index >= NUM_FIXED_BLOCKS >> && cur_bb->next_bb->index >= NUM_FIXED_BLOCKS) >> cur_bb->aux = cur_bb->next_bb; >> - cfg_layout_finalize (); >> + cfg_layout_finalize (false); >> } >> >> epilogue_done: >> @@ -6517,7 +6519,7 @@ epilogue_done: >> basic_block simple_return_block_cold = NULL; >> edge pending_edge_hot = NULL; >> edge pending_edge_cold = NULL; >> - basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb; >> + basic_block exit_pred; >> int i; >> >> gcc_assert (entry_edge != orig_entry_edge); >> @@ -6545,6 +6547,12 @@ epilogue_done: >> else >> pending_edge_cold = e; >> } >> + >> + /* Save a pointer to the exit's predecessor BB for use in >> + inserting new BBs at the end of the function. Do this >> + after the call to split_block above which may split >> + the original exit pred. */ >> + exit_pred = EXIT_BLOCK_PTR->prev_bb; >> >> FOR_EACH_VEC_ELT (edge, unconverted_simple_returns, i, e) >> { >> Index: function.h >> =================================================================== >> --- function.h (revision 193376) >> +++ function.h (working copy) >> @@ -459,6 +459,11 @@ struct GTY(()) rtl_data { >> sched2) and is useful only if the port defines LEAF_REGISTERS. */ >> bool uses_only_leaf_regs; >> >> + /* Nonzero if the function being compiled has undergone hot/cold >> partitioning >> + (under flag_reorder_blocks_and_partition) and has at least one cold >> + block. */ >> + bool has_bb_partition; >> + >> /* Like regs_ever_live, but 1 if a reg is set or clobbered from an >> asm. Unlike regs_ever_live, elements of this array corresponding >> to eliminable regs (like the frame pointer) are set if an asm >> Index: hw-doloop.c >> =================================================================== >> --- hw-doloop.c (revision 193376) >> +++ hw-doloop.c (working copy) >> @@ -547,7 +547,7 @@ reorder_loops (hwloop_info loops) >> else >> bb->aux = NULL; >> } >> - cfg_layout_finalize (); >> + cfg_layout_finalize (false); >> clear_aux_for_blocks (); >> df_analyze (); >> } >> Index: cfgcleanup.c >> =================================================================== >> --- cfgcleanup.c (revision 193376) >> +++ cfgcleanup.c (working copy) >> @@ -1824,7 +1824,7 @@ try_crossjump_to_edge (int mode, edge e1, edge e2, >> partition boundaries). See the comments at the top of >> bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ >> >> - if (flag_reorder_blocks_and_partition && reload_completed) >> + if (crtl->has_bb_partition && reload_completed) >> return false; >> >> /* Search backward through forwarder blocks. We don't need to worry >> @@ -2767,10 +2767,21 @@ try_optimize_cfg (int mode) >> df_analyze (); >> } >> >> + if (changed) >> + { >> + /* Edge forwarding in particular can cause hot blocks >> previously >> + reached by both hot and cold blocks to become dominated >> only >> + by cold blocks. This will cause the verification below to >> fail, >> + and lead to now cold code in the hot section. This is not >> easy >> + to detect and fix during edge forwarding, and in some cases >> + is only visible after newly unreachable blocks are deleted, >> + which will be done in fixup_partitions. */ >> + fixup_partitions (); >> + >> #ifdef ENABLE_CHECKING >> - if (changed) >> - verify_flow_info (); >> + verify_flow_info (); >> #endif >> + } >> >> changed_overall |= changed; >> first_pass = false; >> Index: bb-reorder.c >> =================================================================== >> --- bb-reorder.c (revision 193376) >> +++ bb-reorder.c (working copy) >> @@ -1054,7 +1054,7 @@ connect_traces (int n_traces, struct trace *traces >> current_partition = BB_PARTITION (traces[0].first); >> two_passes = false; >> >> - if (flag_reorder_blocks_and_partition) >> + if (crtl->has_bb_partition) >> for (i = 0; i < n_traces && !two_passes; i++) >> if (BB_PARTITION (traces[0].first) >> != BB_PARTITION (traces[i].first)) >> @@ -1263,7 +1263,7 @@ connect_traces (int n_traces, struct trace *traces >> } >> } >> >> - if (flag_reorder_blocks_and_partition) >> + if (crtl->has_bb_partition) >> try_copy = false; >> >> /* Copy tiny blocks always; copy larger blocks only when the >> @@ -1381,13 +1381,14 @@ get_uncond_jump_length (void) >> return length; >> } >> >> -/* Emit a barrier into the footer of BB. */ >> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. */ >> >> -static void >> +void >> emit_barrier_after_bb (basic_block bb) >> { >> rtx barrier = emit_barrier_after (BB_END (bb)); >> - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); >> + if (current_ir_type () == IR_RTL_CFGLAYOUT) >> + BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); >> } >> >> /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions. >> @@ -1463,18 +1464,109 @@ find_rarely_executed_basic_blocks_and_crossing_edg >> { >> VEC(edge, heap) *crossing_edges = NULL; >> basic_block bb; >> - edge e; >> - edge_iterator ei; >> + edge e, e2; >> + edge_iterator ei, ei2; >> + unsigned int cold_bb_count = 0; >> + VEC (basic_block, heap) *bbs_in_hot_partition = NULL; >> + VEC (basic_block, heap) *bbs_newly_hot = NULL; >> >> /* Mark which partition (hot/cold) each basic block belongs in. */ >> FOR_EACH_BB (bb) >> { >> if (probably_never_executed_bb_p (cfun, bb)) >> - BB_SET_PARTITION (bb, BB_COLD_PARTITION); >> + { >> + BB_SET_PARTITION (bb, BB_COLD_PARTITION); >> + cold_bb_count++; >> + } >> else >> - BB_SET_PARTITION (bb, BB_HOT_PARTITION); >> + { >> + BB_SET_PARTITION (bb, BB_HOT_PARTITION); >> + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, bb); >> + } >> } >> >> + /* Ensure that no cold bbs dominate hot bbs. This could happen as a >> result of >> + several different possibilities. One is that there are edge weight >> insanities >> + due to optimization phases that do not properly update basic block >> profile >> + counts. The second is that the entry of the function may not be hot, >> because >> + it is entered fewer times than the number of profile training runs, >> but there >> + is a loop inside the function that causes blocks within the function >> to be >> + above the threshold for hotness. */ >> + if (cold_bb_count) >> + { >> + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); >> + >> + if (dom_calculated_here) >> + calculate_dominance_info (CDI_DOMINATORS); >> + >> + /* Keep examining hot bbs until we have either checked them all, or >> + re-marked all cold bbs hot. */ >> + while (! VEC_empty (basic_block, bbs_in_hot_partition) >> + && cold_bb_count) >> + { >> + basic_block dom_bb; >> + >> + bb = VEC_pop (basic_block, bbs_in_hot_partition); >> + dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb); >> + >> + /* If bb's immediate dominator is also hot then it is ok. */ >> + if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION) >> + continue; >> + >> + /* We have a hot bb with an immediate dominator that is cold. >> + The dominator needs to be re-marked to hot. */ >> + BB_SET_PARTITION (dom_bb, BB_HOT_PARTITION); >> + cold_bb_count--; >> + >> + /* Now we need to examine newly-hot dom_bb to see if it is also >> + dominated by a cold bb. */ >> + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, dom_bb); >> + >> + /* We should also adjust any cold blocks that the newly-hot bb >> + feeds and see if it makes sense to re-mark those as hot as >> + well. */ >> + VEC_safe_push (basic_block, heap, bbs_newly_hot, dom_bb); >> + while (! VEC_empty (basic_block, bbs_newly_hot)) >> + { >> + basic_block new_hot_bb = VEC_pop (basic_block, bbs_newly_hot); >> + /* Examine all successors of this newly-hot bb to see if they >> + are cold and should be re-marked as hot. */ >> + FOR_EACH_EDGE (e, ei, new_hot_bb->succs) >> + { >> + bool any_cold_preds = false; >> + basic_block succ = e->dest; >> + if (BB_PARTITION (succ) != BB_COLD_PARTITION) >> + continue; >> + /* Does this block have any cold predecessors now? */ >> + FOR_EACH_EDGE (e2, ei2, succ->preds) >> + { >> + if (BB_PARTITION (e2->src) == BB_COLD_PARTITION) >> + { >> + any_cold_preds = true; >> + break; >> + } >> + } >> + if (any_cold_preds) >> + continue; >> + >> + /* Here we have a successor of newly-hot bb that is cold >> + but no longer has any cold precessessors. Since the >> original >> + assignment of our newly-hot bb was incorrect, this >> successor's >> + assignment as cold is also suspect. Go ahead and >> re-mark it >> + as hot now too. Better heuristics may be in order >> here. */ >> + BB_SET_PARTITION (succ, BB_HOT_PARTITION); >> + cold_bb_count--; >> + VEC_safe_push (basic_block, heap, bbs_in_hot_partition, >> succ); >> + /* Examine this successor as a newly-hot bb. */ >> + VEC_safe_push (basic_block, heap, bbs_newly_hot, succ); >> + } >> + } >> + } >> + >> + if (dom_calculated_here) >> + free_dominance_info (CDI_DOMINATORS); >> + } >> + >> /* The format of .gcc_except_table does not allow landing pads to >> be in a different partition as the throw. Fix this by either >> moving or duplicating the landing pads. */ >> @@ -1766,10 +1858,10 @@ fix_up_fall_thru_edges (void) >> new_bb->aux = cur_bb->aux; >> cur_bb->aux = new_bb; >> >> - /* Make sure new fall-through bb is in same >> - partition as bb it's falling through from. */ >> + /* This is done by force_nonfallthru_and_redirect. */ >> + gcc_assert (BB_PARTITION (new_bb) >> + == BB_PARTITION (cur_bb)); >> >> - BB_COPY_PARTITION (new_bb, cur_bb); >> single_succ_edge (new_bb)->flags |= EDGE_CROSSING; >> } >> else >> @@ -2067,7 +2159,10 @@ add_reg_crossing_jump_notes (void) >> FOR_EACH_BB (bb) >> FOR_EACH_EDGE (e, ei, bb->succs) >> if ((e->flags & EDGE_CROSSING) >> - && JUMP_P (BB_END (e->src))) >> + && JUMP_P (BB_END (e->src)) >> + /* Some notes were added during fix_up_fall_thru_edges, via >> + force_nonfallthru_and_redirect. */ >> + && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX)) >> add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> } >> >> @@ -2160,7 +2255,7 @@ reorder_basic_blocks (void) >> dump_flow_info (dump_file, dump_flags); >> } >> >> - if (flag_reorder_blocks_and_partition) >> + if (crtl->has_bb_partition) >> verify_hot_cold_block_grouping (); >> } >> >> @@ -2172,14 +2267,14 @@ reorder_basic_blocks (void) >> encountering this note will make the compiler switch between the >> hot and cold text sections. */ >> >> -static void >> +void >> insert_section_boundary_note (void) >> { >> basic_block bb; >> rtx new_note; >> int first_partition = 0; >> >> - if (!flag_reorder_blocks_and_partition) >> + if (!crtl->has_bb_partition) >> return; >> >> FOR_EACH_BB (bb) >> @@ -2222,10 +2317,8 @@ rest_of_handle_reorder_blocks (void) >> FOR_EACH_BB (bb) >> if (bb->next_bb != EXIT_BLOCK_PTR) >> bb->aux = bb->next_bb; >> - cfg_layout_finalize (); >> + cfg_layout_finalize (true); >> >> - /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */ >> - insert_section_boundary_note (); >> return 0; >> } >> >> @@ -2366,7 +2459,7 @@ duplicate_computed_gotos (void) >> } >> >> done: >> - cfg_layout_finalize (); >> + cfg_layout_finalize (false); >> >> BITMAP_FREE (candidates); >> return 0; >> @@ -2511,6 +2604,8 @@ partition_hot_cold_basic_blocks (void) >> if (crossing_edges == NULL) >> return 0; >> >> + crtl->has_bb_partition = true; >> + >> /* Make sure the source of any crossing edge ends in a jump and the >> destination of any crossing edge has a label. */ >> add_labels_and_missing_jumps (crossing_edges); >> Index: bb-reorder.h >> =================================================================== >> --- bb-reorder.h (revision 193376) >> +++ bb-reorder.h (working copy) >> @@ -36,4 +36,8 @@ extern struct target_bb_reorder *this_target_bb_re >> >> extern int get_uncond_jump_length (void); >> >> +extern void insert_section_boundary_note (void); >> + >> +extern void emit_barrier_after_bb (basic_block bb); >> + >> #endif >> Index: basic-block.h >> =================================================================== >> --- basic-block.h (revision 193376) >> +++ basic-block.h (working copy) >> @@ -806,6 +806,7 @@ extern basic_block force_nonfallthru_and_redirect >> extern bool contains_no_active_insn_p (const_basic_block); >> extern bool forwarder_block_p (const_basic_block); >> extern bool can_fallthru (basic_block, basic_block); >> +extern void fixup_partitions (void); >> >> /* In cfgbuild.c. */ >> extern void find_many_sub_basic_blocks (sbitmap); >> Index: cfgrtl.c >> =================================================================== >> --- cfgrtl.c (revision 193376) >> +++ cfgrtl.c (working copy) >> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3. If not see >> #include "tree.h" >> #include "hard-reg-set.h" >> #include "basic-block.h" >> +#include "bb-reorder.h" >> #include "regs.h" >> #include "flags.h" >> #include "function.h" >> @@ -67,11 +68,12 @@ along with GCC; see the file COPYING3. If not see >> Only applicable if the CFG is in cfglayout mode. */ >> static GTY(()) rtx cfg_layout_function_footer; >> static GTY(()) rtx cfg_layout_function_header; >> +static bool had_sec_boundary_notes; >> >> static rtx skip_insns_after_block (basic_block); >> static void record_effective_endpoints (void); >> static rtx label_for_bb (basic_block); >> -static void fixup_reorder_chain (void); >> +static void fixup_reorder_chain (bool finalize_reorder_blocks); >> >> void verify_insn_chain (void); >> static void fixup_fallthru_exit_predecessor (void); >> @@ -976,8 +978,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc >> partition boundaries). See the comments at the top of >> bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ >> >> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) >> - || BB_PARTITION (src) != BB_PARTITION (target)) >> + if (BB_PARTITION (src) != BB_PARTITION (target)) >> return NULL; >> >> /* We can replace or remove a complex jump only when we have exactly >> @@ -1286,6 +1287,71 @@ redirect_branch_edge (edge e, basic_block target) >> return e; >> } >> >> +/* Called when edge E has been redirected to a new destination, >> + in order to update the region crossing flag on the edge and >> + jump. */ >> + >> +static void >> +fixup_partition_crossing (edge e, basic_block target) >> +{ >> + rtx note; >> + >> + gcc_assert (e->dest == target); >> + >> + if (e->src == ENTRY_BLOCK_PTR || target == EXIT_BLOCK_PTR) >> + return; >> + /* If we redirected an existing edge, it may already be marked >> + crossing, even though the new src is missing a reg crossing note. >> + But make sure reg crossing note doesn't already exist before >> + inserting. */ >> + if (BB_PARTITION (e->src) != BB_PARTITION (target)) >> + { >> + e->flags |= EDGE_CROSSING; >> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> + if (JUMP_P (BB_END (e->src)) >> + && !note) >> + add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> + } >> + else if (BB_PARTITION (e->src) == BB_PARTITION (target)) >> + { >> + e->flags &= ~EDGE_CROSSING; >> + /* Remove the region crossing note from jump at end of >> + e->src if it exists. */ >> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> + if (note) >> + remove_note (BB_END (e->src), note); >> + } >> +} >> + >> +/* Called when block BB has been reassigned to a different partition, >> + to ensure that the region crossing attributes are updated. */ >> + >> +static void >> +fixup_bb_partition (basic_block bb) >> +{ >> + edge e; >> + edge_iterator ei; >> + >> + /* Now need to make bb's pred edges non-region crossing. */ >> + FOR_EACH_EDGE (e, ei, bb->preds) >> + { >> + fixup_partition_crossing (e, e->dest); >> + } >> + >> + /* Possibly need to make bb's successor edges region crossing, >> + or remove stale region crossing. */ >> + FOR_EACH_EDGE (e, ei, bb->succs) >> + { >> + if ((e->flags & EDGE_FALLTHRU) >> + && BB_PARTITION (bb) != BB_PARTITION (e->dest) >> + && e->dest != EXIT_BLOCK_PTR) >> + /* force_nonfallthru_and_redirect calls fixup_partition_crossing. >> */ >> + force_nonfallthru (e); >> + else >> + fixup_partition_crossing (e, e->dest); >> + } >> +} >> + >> /* Attempt to change code to redirect edge E to TARGET. Don't do that on >> expense of adding new instructions or reordering basic blocks. >> >> @@ -1302,16 +1368,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block >> { >> edge ret; >> basic_block src = e->src; >> + basic_block dest = e->dest; >> >> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) >> return NULL; >> >> - if (e->dest == target) >> + if (dest == target) >> return e; >> >> if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL) >> { >> df_set_bb_dirty (src); >> + fixup_partition_crossing (ret, target); >> return ret; >> } >> >> @@ -1320,6 +1388,7 @@ rtl_redirect_edge_and_branch (edge e, basic_block >> return NULL; >> >> df_set_bb_dirty (src); >> + fixup_partition_crossing (ret, target); >> return ret; >> } >> >> @@ -1454,18 +1523,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >> /* Make sure new block ends up in correct hot/cold section. */ >> >> BB_COPY_PARTITION (jump_block, e->src); >> - if (flag_reorder_blocks_and_partition >> - && targetm_common.have_named_sections >> - && JUMP_P (BB_END (jump_block)) >> - && !any_condjump_p (BB_END (jump_block)) >> - && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING)) >> - add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX); >> >> /* Wire edge in. */ >> new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU); >> new_edge->probability = probability; >> new_edge->count = count; >> >> + /* If e->src was previously region crossing, it no longer is >> + and the reg crossing note should be removed. */ >> + fixup_partition_crossing (new_edge, jump_block); >> + >> /* Redirect old edge. */ >> redirect_edge_pred (e, jump_block); >> e->probability = REG_BR_PROB_BASE; >> @@ -1521,13 +1588,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >> LABEL_NUSES (label)++; >> } >> >> - emit_barrier_after (BB_END (jump_block)); >> + /* We might be in cfg layout mode, and if so, the following routine will >> + insert the barrier correctly. */ >> + emit_barrier_after_bb (jump_block); >> redirect_edge_succ_nodup (e, target); >> >> if (abnormal_edge_flags) >> make_edge (src, target, abnormal_edge_flags); >> >> df_mark_solutions_dirty (); >> + fixup_partition_crossing (e, target); >> return new_bb; >> } >> >> @@ -1626,7 +1696,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU >> static basic_block >> rtl_split_edge (edge edge_in) >> { >> - basic_block bb; >> + basic_block bb, new_bb; >> rtx before; >> >> /* Abnormal edges cannot be split. */ >> @@ -1659,12 +1729,26 @@ rtl_split_edge (edge edge_in) >> else >> { >> bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); >> - /* ??? Why not edge_in->dest->prev_bb here? */ >> - BB_COPY_PARTITION (bb, edge_in->dest); >> + if (edge_in->src == ENTRY_BLOCK_PTR) >> + BB_COPY_PARTITION (bb, edge_in->dest); >> + else >> + /* Put the split bb into the src partition, to avoid creating >> + a situation where a cold bb dominates a hot bb, in the case >> + where src is cold and dest is hot. The src will dominate >> + the new bb (whereas it might not have dominated dest). */ >> + BB_COPY_PARTITION (bb, edge_in->src); >> } >> >> make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU); >> >> + /* Can't allow a region crossing edge to be fallthrough. */ >> + if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest) >> + && edge_in->dest != EXIT_BLOCK_PTR) >> + { >> + new_bb = force_nonfallthru (single_succ_edge (bb)); >> + gcc_assert (!new_bb); >> + } >> + >> /* For non-fallthru edges, we must adjust the predecessor's >> jump instruction to target our new block. */ >> if ((edge_in->flags & EDGE_FALLTHRU) == 0) >> @@ -1777,17 +1861,13 @@ commit_one_edge_insertion (edge e) >> else >> { >> bb = split_edge (e); >> - after = BB_END (bb); >> >> - if (flag_reorder_blocks_and_partition >> - && targetm_common.have_named_sections >> - && e->src != ENTRY_BLOCK_PTR >> - && BB_PARTITION (e->src) == BB_COLD_PARTITION >> - && !(e->flags & EDGE_CROSSING) >> - && JUMP_P (after) >> - && !any_condjump_p (after) >> - && (single_succ_edge (bb)->flags & EDGE_CROSSING)) >> - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX); >> + /* If e crossed a partition boundary, we needed to make bb end in >> + a region-crossing jump, even though it was originally fallthru. */ >> + if (JUMP_P (BB_END (bb))) >> + before = BB_END (bb); >> + else >> + after = BB_END (bb); >> } >> >> /* Now that we've found the spot, do the insertion. */ >> @@ -1827,6 +1907,14 @@ commit_edge_insertions (void) >> { >> basic_block bb; >> >> + /* Optimization passes that invoke this routine can cause hot blocks >> + previously reached by both hot and cold blocks to become dominated only >> + by cold blocks. This will cause the verification below to fail, >> + and lead to now cold code in the hot section. In some cases this >> + may only be visible after newly unreachable blocks are deleted, >> + which will be done by fixup_partitions. */ >> + fixup_partitions (); >> + >> #ifdef ENABLE_CHECKING >> verify_flow_info (); >> #endif >> @@ -2028,7 +2116,75 @@ get_last_bb_insn (basic_block bb) >> >> return end; >> } >> - >> + >> +/* Perform cleanup on the hot/cold bb partitioning after optimization >> + passes that modify the cfg. */ >> + >> +void >> +fixup_partitions (void) >> +{ >> + basic_block bb; >> + >> + if (!crtl->has_bb_partition) >> + return; >> + >> + /* Delete any blocks that became unreachable and weren't >> + already cleaned up, for example during edge forwarding >> + and convert_jumps_to_returns. This will expose more >> + opportunities for fixing the partition boundaries here. >> + Also, the calculation of the dominance graph during verification >> + will assert if there are unreachable nodes. */ >> + delete_unreachable_blocks (); >> + >> + /* If there are partitions, do a sanity check on them: A basic block in >> + a cold partition cannot dominate a basic block in a hot partition. >> + Fixup any that now violate this requirement, as a result of edge >> + forwarding and unreachable block deletion. */ >> + VEC (basic_block, heap) *bbs_in_cold_partition = NULL; >> + VEC (basic_block, heap) *bbs_to_fix = NULL; >> + FOR_EACH_BB (bb) >> + if ((BB_PARTITION (bb) == BB_COLD_PARTITION)) >> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb); >> + if (! VEC_empty (basic_block, bbs_in_cold_partition)) >> + { >> + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); >> + basic_block son; >> + >> + if (dom_calculated_here) >> + calculate_dominance_info (CDI_DOMINATORS); >> + >> + while (! VEC_empty (basic_block, bbs_in_cold_partition)) >> + { >> + bb = VEC_pop (basic_block, bbs_in_cold_partition); >> + /* If bb is not yet cold (because it was added below as >> + a block dominated by a cold bb) then mark it cold here. */ >> + if ((BB_PARTITION (bb) != BB_COLD_PARTITION)) >> + { >> + BB_SET_PARTITION (bb, BB_COLD_PARTITION); >> + VEC_safe_push (basic_block, heap, bbs_to_fix, bb); >> + } >> + /* Any blocks dominated by a block in the cold section >> + must also be cold. */ >> + for (son = first_dom_son (CDI_DOMINATORS, bb); >> + son; >> + son = next_dom_son (CDI_DOMINATORS, son)) >> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son); >> + } >> + >> + if (dom_calculated_here) >> + free_dominance_info (CDI_DOMINATORS); >> + } >> + >> + /* Do the partition fixup after all necessary blocks have been converted >> to >> + cold, so that we only update the region crossings the minimum number of >> + places, which can require forcing edges to be non fallthru. */ >> + while (! VEC_empty (basic_block, bbs_to_fix)) >> + { >> + bb = VEC_pop (basic_block, bbs_to_fix); >> + fixup_bb_partition (bb); >> + } >> +} >> + >> /* Verify the CFG and RTL consistency common for both underlying RTL and >> cfglayout RTL. >> >> @@ -2052,6 +2208,7 @@ rtl_verify_flow_info_1 (void) >> rtx x; >> int err = 0; >> basic_block bb; >> + bool have_partitions = false; >> >> /* Check the general integrity of the basic blocks. */ >> FOR_EACH_BB_REVERSE (bb) >> @@ -2169,6 +2326,8 @@ rtl_verify_flow_info_1 (void) >> >> if (e->flags & EDGE_ABNORMAL) >> n_abnormal++; >> + >> + have_partitions |= is_crossing; >> } >> >> if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX)) >> @@ -2293,6 +2452,40 @@ rtl_verify_flow_info_1 (void) >> } >> } >> >> + /* If there are partitions, do a sanity check on them: A basic block in >> + a cold partition cannot dominate a basic block in a hot partition. */ >> + VEC (basic_block, heap) *bbs_in_cold_partition = NULL; >> + if (have_partitions && !err) >> + FOR_EACH_BB (bb) >> + if ((BB_PARTITION (bb) == BB_COLD_PARTITION)) >> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb); >> + if (! VEC_empty (basic_block, bbs_in_cold_partition)) >> + { >> + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); >> + basic_block son; >> + >> + if (dom_calculated_here) >> + calculate_dominance_info (CDI_DOMINATORS); >> + >> + while (! VEC_empty (basic_block, bbs_in_cold_partition)) >> + { >> + bb = VEC_pop (basic_block, bbs_in_cold_partition); >> + if ((BB_PARTITION (bb) != BB_COLD_PARTITION)) >> + { >> + error ("non-cold basic block %d dominated " >> + "by a block in the cold partition", bb->index); >> + err = 1; >> + } >> + for (son = first_dom_son (CDI_DOMINATORS, bb); >> + son; >> + son = next_dom_son (CDI_DOMINATORS, son)) >> + VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son); >> + } >> + >> + if (dom_calculated_here) >> + free_dominance_info (CDI_DOMINATORS); >> + } >> + >> /* Clean up. */ >> return err; >> } >> @@ -2965,14 +3158,41 @@ record_effective_endpoints (void) >> else >> cfg_layout_function_header = NULL_RTX; >> >> + had_sec_boundary_notes = false; >> + >> next_insn = get_insns (); >> FOR_EACH_BB (bb) >> { >> rtx end; >> >> if (PREV_INSN (BB_HEAD (bb)) && next_insn != BB_HEAD (bb)) >> - BB_HEADER (bb) = unlink_insn_chain (next_insn, >> - PREV_INSN (BB_HEAD (bb))); >> + { >> + /* Rather than try to keep section boundary notes incrementally >> + up-to-date through cfg layout optimizations, simply remove them >> + and flag that they should be re-inserted when exiting >> + cfg layout mode. */ >> + rtx check_insn = next_insn; >> + while (check_insn) >> + { >> + if (NOTE_P (check_insn) >> + && NOTE_KIND (check_insn) == >> NOTE_INSN_SWITCH_TEXT_SECTIONS) >> + { >> + had_sec_boundary_notes |= true; >> + /* Remove note from chain. Grab new next_insn first. */ >> + if (next_insn == check_insn) >> + next_insn = NEXT_INSN (check_insn); >> + /* Delete note. */ >> + delete_insn (check_insn); >> + /* There will only be one. */ >> + break; >> + } >> + check_insn = NEXT_INSN (check_insn); >> + } >> + /* If we still have header instructions left after above loop. */ >> + if (next_insn != BB_HEAD (bb)) >> + BB_HEADER (bb) = unlink_insn_chain (next_insn, >> + PREV_INSN (BB_HEAD (bb))); >> + } >> end = skip_insns_after_block (bb); >> if (NEXT_INSN (BB_END (bb)) && BB_END (bb) != end) >> BB_FOOTER (bb) = unlink_insn_chain (NEXT_INSN (BB_END (bb)), end); >> @@ -3000,7 +3220,7 @@ outof_cfg_layout_mode (void) >> if (bb->next_bb != EXIT_BLOCK_PTR) >> bb->aux = bb->next_bb; >> >> - cfg_layout_finalize (); >> + cfg_layout_finalize (false); >> >> return 0; >> } >> @@ -3120,10 +3340,13 @@ relink_block_chain (bool stay_in_cfglayout_mode) >> } >> >> >> -/* Given a reorder chain, rearrange the code to match. */ >> +/* Given a reorder chain, rearrange the code to match. If >> + this is called when we will FINALIZE_REORDER_BLOCKS, or when >> + section boundary notes were removed on entry to cfg layout >> + mode, insert section boundary notes here. */ >> >> static void >> -fixup_reorder_chain (void) >> +fixup_reorder_chain (bool finalize_reorder_blocks) >> { >> basic_block bb; >> rtx insn = NULL; >> @@ -3150,7 +3373,7 @@ static void >> PREV_INSN (BB_HEADER (bb)) = insn; >> insn = BB_HEADER (bb); >> while (NEXT_INSN (insn)) >> - insn = NEXT_INSN (insn); >> + insn = NEXT_INSN (insn); >> } >> if (insn) >> NEXT_INSN (insn) = BB_HEAD (bb); >> @@ -3175,6 +3398,11 @@ static void >> insn = NEXT_INSN (insn); >> >> set_last_insn (insn); >> + >> + /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */ >> + if (had_sec_boundary_notes || finalize_reorder_blocks) >> + insert_section_boundary_note (); >> + >> #ifdef ENABLE_CHECKING >> verify_insn_chain (); >> #endif >> @@ -3187,7 +3415,7 @@ static void >> edge e_fall, e_taken, e; >> rtx bb_end_insn; >> rtx ret_label = NULL_RTX; >> - basic_block nb, src_bb; >> + basic_block nb; >> edge_iterator ei; >> >> if (EDGE_COUNT (bb->succs) == 0) >> @@ -3322,7 +3550,6 @@ static void >> /* We got here if we need to add a new jump insn. >> Note force_nonfallthru can delete E_FALL and thus we have to >> save E_FALL->src prior to the call to force_nonfallthru. */ >> - src_bb = e_fall->src; >> nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label); >> if (nb) >> { >> @@ -3330,17 +3557,6 @@ static void >> bb->aux = nb; >> /* Don't process this new block. */ >> bb = nb; >> - >> - /* Make sure new bb is tagged for correct section (same as >> - fall-thru source, since you cannot fall-thru across >> - section boundaries). */ >> - BB_COPY_PARTITION (src_bb, single_pred (bb)); >> - if (flag_reorder_blocks_and_partition >> - && targetm_common.have_named_sections >> - && JUMP_P (BB_END (bb)) >> - && !any_condjump_p (BB_END (bb)) >> - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING)) >> - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX); >> } >> } >> >> @@ -3644,10 +3860,11 @@ duplicate_insn_chain (rtx from, rtx to) >> case NOTE_INSN_FUNCTION_BEG: >> /* There is always just single entry to function. */ >> case NOTE_INSN_BASIC_BLOCK: >> + /* We should only switch text sections once. */ >> + case NOTE_INSN_SWITCH_TEXT_SECTIONS: >> break; >> >> case NOTE_INSN_EPILOGUE_BEG: >> - case NOTE_INSN_SWITCH_TEXT_SECTIONS: >> emit_note_copy (insn); >> break; >> >> @@ -3759,10 +3976,13 @@ break_superblocks (void) >> } >> >> /* Finalize the changes: reorder insn list according to the sequence >> specified >> - by aux pointers, enter compensation code, rebuild scope forest. */ >> + by aux pointers, enter compensation code, rebuild scope forest. If >> + this is called when we will FINALIZE_REORDER_BLOCKS, indicate that >> + to fixup_reorder_chain so that it can insert the proper switch text >> + section notes. */ >> >> void >> -cfg_layout_finalize (void) >> +cfg_layout_finalize (bool finalize_reorder_blocks) >> { >> #ifdef ENABLE_CHECKING >> verify_flow_info (); >> @@ -3775,7 +3995,7 @@ void >> #endif >> ) >> fixup_fallthru_exit_predecessor (); >> - fixup_reorder_chain (); >> + fixup_reorder_chain (finalize_reorder_blocks); >> >> rebuild_jump_labels (get_insns ()); >> delete_dead_jumptables (); >> @@ -4454,8 +4674,7 @@ rtl_can_remove_branch_p (const_edge e) >> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) >> return false; >> >> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) >> - || BB_PARTITION (src) != BB_PARTITION (target)) >> + if (BB_PARTITION (src) != BB_PARTITION (target)) >> return false; >> >> if (!onlyjump_p (insn) >> >> -- >> This patch is available for review at http://codereview.appspot.com/6823047 > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413