Ping.
Teresa

On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson <tejohn...@google.com> wrote:
> Revised patch that fixes failures encountered when enabling
> -freorder-blocks-and-partition, including the failure reported in PR 53743.
>
> This includes new verification code to ensure no cold blocks dominate hot
> blocks contributed by Steven Bosscher.
>
> I attempted to make the handling of partition updates through the optimization
> passes much more consistent, removing a number of partial fixes in the code
> stream in the process. The code to fixup partitions (including the 
> BB_PARTITION
> assignement, region crossing jump notes, and switch text section notes) is
> now handled in a few centralized locations. For example, inside
> rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that 
> callers
> don't need to attempt the fixup themselves.
>
> For optimization passes that make adjustments to the cfg while in cfg layout
> mode that are not easy to fix up incrementally, the new routine
> fixup_partitions handles the cleanup globally. This does require calculation
> of the dominance relation, however, as far as I can tell the routines which
> now invoke this global fixup (try_optimize_cfg and commit_edge_insertions)
> are invoked typically once (or a small number of times in the case of
> try_optimize_cfg) per optimization pass. Additionally, I compared the
> -ftime-report output for some large fdo compilations and saw only minimal
> increases in the dominance computation times, which were only a tiny percent
> of the overall compile time.
>
> Additionally, I added a flag to the rtl_data structure to indicate whether
> any partitioning was actually performed, so that optimizations which were
> conservatively disabled whenever the flag_reorder_blocks_and_partition
> is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less
> conservative for functions where no partitions were formed (e.g. they are
> completely hot).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu. Also tested with 
> SPEC2006 int
> benchmarks and internal google benchmarks using profile feedback and
> -freorder-blocks-and-partition to get more coverage. Ok for trunk?
>
> Thanks,
> Teresa
>
> 2012-11-14  Teresa Johnson  <tejohn...@google.com>
>             Steven Bosscher  <ste...@gcc.gnu.org>
>
>         * cfghooks.h (cfg_layout_finalize): New parameter.
>         * modulo-sched.c (rest_of_handle_sms): New cfg_layout_finalize
>         parameter.
>         * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>         as this is now done by redirect_edge_and_branch_force.
>         * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>         barriers, new cfg_layout_finalize parameter, and don't store exit
>         predecessor BB until after it is potentially split.
>         * function.h (struct rtl_data): New flag has_bb_partition.
>         * hw-doloop.c (reorder_loops): New cfg_layout_finalize parameter.
>         * cfgcleanup.c (try_crossjump_to_edge): Only skip optimization if
>         any blocks in function actually partitioned.
>         (try_optimize_cfg): If cfg changed, invoke fixup_partitions to clean
>         up partitioning.
>         * bb-reorder.c (connect_traces): Only look for partitions and skip
>         block copying if any blocks in function actually partitioned.
>         (emit_barrier_after_bb): Handle insertion in non-cfglayout mode.
>         (find_rarely_executed_basic_blocks_and_crossing_edges): Ensure
>         that no cold blocks dominate a hot block.
>         (fix_up_fall_thru_edges): Replace BB_COPY_PARTITION with assert
>         as this is now done by force_nonfallthru_and_redirect.
>         (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>         already be marked with region crossing note.
>         (reorder_basic_blocks): Only need to verify partitions if any
>         blocks in function actually partitioned.
>         (insert_section_boundary_note): Only need to insert note if any
>         blocks in function actually partitioned.
>         (rest_of_handle_reorder_blocks): New cfg_layout_finalize
>         parameter, and remove call to insert_section_boundary_note as this
>         is now called via cfg_layout_finalize/fixup_reorder_chain.
>         (duplicate_computed_gotos): New cfg_layout_finalize
>         parameter.
>         (partition_hot_cold_basic_blocks): Set flag indicating function
>         has bb partitions.
>         * bb-reorder.h: Declare insert_section_boundary_note and
>         emit_barrier_after_bb, which are no longer static.
>         * basic-block.h: Declare new function fixup_partitions.
>         * cfgrtl.c (try_redirect_by_replacing_jump): Remove unnecessary
>         check for region crossing note.
>         (fixup_partition_crossing): New function.
>         (fixup_bb_partition): Ditto.
>         (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>         (force_nonfallthru_and_redirect): Fixup partition boundaries,
>         remove old code that tried to do this. Emit barrier correctly
>         when we are in cfglayout mode.
>         (rtl_split_edge): Correctly fixup partition boundaries.
>         (commit_one_edge_insertion): Remove old code that tried to
>         fixup region crossing edge since this is now handled in
>         split_block, and set up insertion point correctly since
>         block may now end in a jump.
>         (commit_edge_insertions): Invoke fixup_partitions to sanitize 
> partition
>         boundaries after optimizations that modify cfg and before trying to
>         verify the flow info.
>         (fixup_partitions): New function.
>         (rtl_verify_flow_info_1): Add verification that no cold bbs dominate
>         hot bbs.
>         (record_effective_endpoints): Remove region-crossing notes and set 
> flag
>         indicating that they need to be reinserted on exit from cfglayout 
> mode.
>         (outof_cfg_layout_mode): New cfg_layout_finalize parameter.
>         (fixup_reorder_chain): Call insert_section_boundary_note if necessary.
>         Remove old code that attempted to fixup region crossing note as
>         this is now handled in force_nonfallthru_and_redirect.
>         (duplicate_insn_chain): Don't duplicate switch section notes.
>         (cfg_layout_finalize): Pass new parameter to fixup_reorder_chain.
>         (rtl_can_remove_branch_p): Remove unnecessary check for region 
> crossing
>         note.
>
> Index: cfghooks.h
> ===================================================================
> --- cfghooks.h  (revision 193376)
> +++ cfghooks.h  (working copy)
> @@ -204,7 +204,7 @@ extern void copy_bbs (basic_block *, unsigned, bas
>  void account_profile_record (struct profile_record *, int);
>
>  extern void cfg_layout_initialize (unsigned int);
> -extern void cfg_layout_finalize (void);
> +extern void cfg_layout_finalize (bool);
>
>  /* Hooks containers.  */
>  extern struct cfg_hooks gimple_cfg_hooks;
> @@ -218,4 +218,3 @@ extern void cfg_layout_rtl_register_cfg_hooks (voi
>  extern void gimple_register_cfg_hooks (void);
>  extern struct cfg_hooks get_cfg_hooks (void);
>  extern void set_cfg_hooks (struct cfg_hooks);
> -
> Index: modulo-sched.c
> ===================================================================
> --- modulo-sched.c      (revision 193376)
> +++ modulo-sched.c      (working copy)
> @@ -3354,7 +3354,7 @@ rest_of_handle_sms (void)
>      if (bb->next_bb != EXIT_BLOCK_PTR)
>        bb->aux = bb->next_bb;
>    free_dominance_info (CDI_DOMINATORS);
> -  cfg_layout_finalize ();
> +  cfg_layout_finalize (false);
>  #endif /* INSN_SCHEDULING */
>    return 0;
>  }
> Index: ifcvt.c
> ===================================================================
> --- ifcvt.c     (revision 193376)
> +++ ifcvt.c     (working copy)
> @@ -3900,10 +3900,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg
>    if (new_bb)
>      {
>        df_bb_replace (then_bb_index, new_bb);
> -      /* Since the fallthru edge was redirected from test_bb to new_bb,
> -         we need to ensure that new_bb is in the same partition as
> -         test bb (you can not fall through across section boundaries).  */
> -      BB_COPY_PARTITION (new_bb, test_bb);
> +      /* This should have been done above via force_nonfallthru_and_redirect
> +         (possibly called from redirect_edge_and_branch_force).  */
> +      gcc_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
>      }
>
>    num_true_changes++;
> Index: function.c
> ===================================================================
> --- function.c  (revision 193376)
> +++ function.c  (working copy)
> @@ -6249,8 +6249,10 @@ thread_prologue_and_epilogue_insns (void)
>                     break;
>                 if (e)
>                   {
> -                   copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
> -                                                 NULL_RTX, e->src);
> +                    /* Make sure we insert after any barriers.  */
> +                    rtx end = get_last_bb_insn (e->src);
> +                    copy_bb = create_basic_block (NEXT_INSN (end),
> +                                                  NULL_RTX, e->src);
>                     BB_COPY_PARTITION (copy_bb, e->src);
>                   }
>                 else
> @@ -6475,7 +6477,7 @@ thread_prologue_and_epilogue_insns (void)
>         if (cur_bb->index >= NUM_FIXED_BLOCKS
>             && cur_bb->next_bb->index >= NUM_FIXED_BLOCKS)
>           cur_bb->aux = cur_bb->next_bb;
> -      cfg_layout_finalize ();
> +      cfg_layout_finalize (false);
>      }
>
>  epilogue_done:
> @@ -6517,7 +6519,7 @@ epilogue_done:
>        basic_block simple_return_block_cold = NULL;
>        edge pending_edge_hot = NULL;
>        edge pending_edge_cold = NULL;
> -      basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
> +      basic_block exit_pred;
>        int i;
>
>        gcc_assert (entry_edge != orig_entry_edge);
> @@ -6545,6 +6547,12 @@ epilogue_done:
>             else
>               pending_edge_cold = e;
>           }
> +
> +      /* Save a pointer to the exit's predecessor BB for use in
> +         inserting new BBs at the end of the function. Do this
> +         after the call to split_block above which may split
> +         the original exit pred.  */
> +      exit_pred = EXIT_BLOCK_PTR->prev_bb;
>
>        FOR_EACH_VEC_ELT (edge, unconverted_simple_returns, i, e)
>         {
> Index: function.h
> ===================================================================
> --- function.h  (revision 193376)
> +++ function.h  (working copy)
> @@ -459,6 +459,11 @@ struct GTY(()) rtl_data {
>       sched2) and is useful only if the port defines LEAF_REGISTERS.  */
>    bool uses_only_leaf_regs;
>
> +  /* Nonzero if the function being compiled has undergone hot/cold 
> partitioning
> +     (under flag_reorder_blocks_and_partition) and has at least one cold
> +     block.  */
> +  bool has_bb_partition;
> +
>    /* Like regs_ever_live, but 1 if a reg is set or clobbered from an
>       asm.  Unlike regs_ever_live, elements of this array corresponding
>       to eliminable regs (like the frame pointer) are set if an asm
> Index: hw-doloop.c
> ===================================================================
> --- hw-doloop.c (revision 193376)
> +++ hw-doloop.c (working copy)
> @@ -547,7 +547,7 @@ reorder_loops (hwloop_info loops)
>        else
>         bb->aux = NULL;
>      }
> -  cfg_layout_finalize ();
> +  cfg_layout_finalize (false);
>    clear_aux_for_blocks ();
>    df_analyze ();
>  }
> Index: cfgcleanup.c
> ===================================================================
> --- cfgcleanup.c        (revision 193376)
> +++ cfgcleanup.c        (working copy)
> @@ -1824,7 +1824,7 @@ try_crossjump_to_edge (int mode, edge e1, edge e2,
>       partition boundaries).  See the comments at the top of
>       bb-reorder.c:partition_hot_cold_basic_blocks for complete details.  */
>
> -  if (flag_reorder_blocks_and_partition && reload_completed)
> +  if (crtl->has_bb_partition && reload_completed)
>      return false;
>
>    /* Search backward through forwarder blocks.  We don't need to worry
> @@ -2767,10 +2767,21 @@ try_optimize_cfg (int mode)
>               df_analyze ();
>             }
>
> +         if (changed)
> +            {
> +              /* Edge forwarding in particular can cause hot blocks 
> previously
> +                 reached by both hot and cold blocks to become dominated only
> +                 by cold blocks. This will cause the verification below to 
> fail,
> +                 and lead to now cold code in the hot section. This is not 
> easy
> +                 to detect and fix during edge forwarding, and in some cases
> +                 is only visible after newly unreachable blocks are deleted,
> +                 which will be done in fixup_partitions.  */
> +              fixup_partitions ();
> +
>  #ifdef ENABLE_CHECKING
> -         if (changed)
> -           verify_flow_info ();
> +              verify_flow_info ();
>  #endif
> +            }
>
>           changed_overall |= changed;
>           first_pass = false;
> Index: bb-reorder.c
> ===================================================================
> --- bb-reorder.c        (revision 193376)
> +++ bb-reorder.c        (working copy)
> @@ -1054,7 +1054,7 @@ connect_traces (int n_traces, struct trace *traces
>    current_partition = BB_PARTITION (traces[0].first);
>    two_passes = false;
>
> -  if (flag_reorder_blocks_and_partition)
> +  if (crtl->has_bb_partition)
>      for (i = 0; i < n_traces && !two_passes; i++)
>        if (BB_PARTITION (traces[0].first)
>           != BB_PARTITION (traces[i].first))
> @@ -1263,7 +1263,7 @@ connect_traces (int n_traces, struct trace *traces
>                       }
>                   }
>
> -             if (flag_reorder_blocks_and_partition)
> +             if (crtl->has_bb_partition)
>                 try_copy = false;
>
>               /* Copy tiny blocks always; copy larger blocks only when the
> @@ -1381,13 +1381,14 @@ get_uncond_jump_length (void)
>    return length;
>  }
>
> -/* Emit a barrier into the footer of BB.  */
> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode.  */
>
> -static void
> +void
>  emit_barrier_after_bb (basic_block bb)
>  {
>    rtx barrier = emit_barrier_after (BB_END (bb));
> -  BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
> +  if (current_ir_type () == IR_RTL_CFGLAYOUT)
> +    BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
>  }
>
>  /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
> @@ -1463,18 +1464,109 @@ find_rarely_executed_basic_blocks_and_crossing_edg
>  {
>    VEC(edge, heap) *crossing_edges = NULL;
>    basic_block bb;
> -  edge e;
> -  edge_iterator ei;
> +  edge e, e2;
> +  edge_iterator ei, ei2;
> +  unsigned int cold_bb_count = 0;
> +  VEC (basic_block, heap) *bbs_in_hot_partition = NULL;
> +  VEC (basic_block, heap) *bbs_newly_hot = NULL;
>
>    /* Mark which partition (hot/cold) each basic block belongs in.  */
>    FOR_EACH_BB (bb)
>      {
>        if (probably_never_executed_bb_p (cfun, bb))
> -       BB_SET_PARTITION (bb, BB_COLD_PARTITION);
> +        {
> +          BB_SET_PARTITION (bb, BB_COLD_PARTITION);
> +          cold_bb_count++;
> +        }
>        else
> -       BB_SET_PARTITION (bb, BB_HOT_PARTITION);
> +        {
> +          BB_SET_PARTITION (bb, BB_HOT_PARTITION);
> +          VEC_safe_push (basic_block, heap, bbs_in_hot_partition, bb);
> +        }
>      }
>
> +  /* Ensure that no cold bbs dominate hot bbs. This could happen as a result 
> of
> +     several different possibilities. One is that there are edge weight 
> insanities
> +     due to optimization phases that do not properly update basic block 
> profile
> +     counts. The second is that the entry of the function may not be hot, 
> because
> +     it is entered fewer times than the number of profile training runs, but 
> there
> +     is a loop inside the function that causes blocks within the function to 
> be
> +     above the threshold for hotness.  */
> +  if (cold_bb_count)
> +    {
> +      bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS);
> +
> +      if (dom_calculated_here)
> +        calculate_dominance_info (CDI_DOMINATORS);
> +
> +      /* Keep examining hot bbs until we have either checked them all, or
> +         re-marked all cold bbs hot.  */
> +      while (! VEC_empty (basic_block, bbs_in_hot_partition)
> +             && cold_bb_count)
> +        {
> +          basic_block dom_bb;
> +
> +          bb = VEC_pop (basic_block, bbs_in_hot_partition);
> +          dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb);
> +
> +          /* If bb's immediate dominator is also hot then it is ok.  */
> +          if (BB_PARTITION (dom_bb) != BB_COLD_PARTITION)
> +            continue;
> +
> +          /* We have a hot bb with an immediate dominator that is cold.
> +             The dominator needs to be re-marked to hot.  */
> +          BB_SET_PARTITION (dom_bb, BB_HOT_PARTITION);
> +          cold_bb_count--;
> +
> +          /* Now we need to examine newly-hot dom_bb to see if it is also
> +             dominated by a cold bb.  */
> +          VEC_safe_push (basic_block, heap, bbs_in_hot_partition, dom_bb);
> +
> +          /* We should also adjust any cold blocks that the newly-hot bb
> +             feeds and see if it makes sense to re-mark those as hot as
> +             well.  */
> +          VEC_safe_push (basic_block, heap, bbs_newly_hot, dom_bb);
> +          while (! VEC_empty (basic_block, bbs_newly_hot))
> +            {
> +              basic_block new_hot_bb = VEC_pop (basic_block, bbs_newly_hot);
> +              /* Examine all successors of this newly-hot bb to see if they
> +                 are cold and should be re-marked as hot.  */
> +              FOR_EACH_EDGE (e, ei, new_hot_bb->succs)
> +                {
> +                  bool any_cold_preds = false;
> +                  basic_block succ = e->dest;
> +                  if (BB_PARTITION (succ) != BB_COLD_PARTITION)
> +                    continue;
> +                  /* Does this block have any cold predecessors now?  */
> +                  FOR_EACH_EDGE (e2, ei2, succ->preds)
> +                  {
> +                    if (BB_PARTITION (e2->src) == BB_COLD_PARTITION)
> +                      {
> +                        any_cold_preds = true;
> +                        break;
> +                      }
> +                  }
> +                  if (any_cold_preds)
> +                    continue;
> +
> +                  /* Here we have a successor of newly-hot bb that is cold
> +                     but no longer has any cold precessessors. Since the 
> original
> +                     assignment of our newly-hot bb was incorrect, this 
> successor's
> +                     assignment as cold is also suspect. Go ahead and 
> re-mark it
> +                     as hot now too. Better heuristics may be in order here. 
>  */
> +                  BB_SET_PARTITION (succ, BB_HOT_PARTITION);
> +                  cold_bb_count--;
> +                  VEC_safe_push (basic_block, heap, bbs_in_hot_partition, 
> succ);
> +                  /* Examine this successor as a newly-hot bb.  */
> +                  VEC_safe_push (basic_block, heap, bbs_newly_hot, succ);
> +                }
> +            }
> +        }
> +
> +      if (dom_calculated_here)
> +        free_dominance_info (CDI_DOMINATORS);
> +    }
> +
>    /* The format of .gcc_except_table does not allow landing pads to
>       be in a different partition as the throw.  Fix this by either
>       moving or duplicating the landing pads.  */
> @@ -1766,10 +1858,10 @@ fix_up_fall_thru_edges (void)
>                       new_bb->aux = cur_bb->aux;
>                       cur_bb->aux = new_bb;
>
> -                     /* Make sure new fall-through bb is in same
> -                        partition as bb it's falling through from.  */
> +                      /* This is done by force_nonfallthru_and_redirect.  */
> +                     gcc_assert (BB_PARTITION (new_bb)
> +                                  == BB_PARTITION (cur_bb));
>
> -                     BB_COPY_PARTITION (new_bb, cur_bb);
>                       single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
>                     }
>                   else
> @@ -2067,7 +2159,10 @@ add_reg_crossing_jump_notes (void)
>    FOR_EACH_BB (bb)
>      FOR_EACH_EDGE (e, ei, bb->succs)
>        if ((e->flags & EDGE_CROSSING)
> -         && JUMP_P (BB_END (e->src)))
> +         && JUMP_P (BB_END (e->src))
> +          /* Some notes were added during fix_up_fall_thru_edges, via
> +             force_nonfallthru_and_redirect.  */
> +          && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX))
>         add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
>  }
>
> @@ -2160,7 +2255,7 @@ reorder_basic_blocks (void)
>        dump_flow_info (dump_file, dump_flags);
>      }
>
> -  if (flag_reorder_blocks_and_partition)
> +  if (crtl->has_bb_partition)
>      verify_hot_cold_block_grouping ();
>  }
>
> @@ -2172,14 +2267,14 @@ reorder_basic_blocks (void)
>     encountering this note will make the compiler switch between the
>     hot and cold text sections.  */
>
> -static void
> +void
>  insert_section_boundary_note (void)
>  {
>    basic_block bb;
>    rtx new_note;
>    int first_partition = 0;
>
> -  if (!flag_reorder_blocks_and_partition)
> +  if (!crtl->has_bb_partition)
>      return;
>
>    FOR_EACH_BB (bb)
> @@ -2222,10 +2317,8 @@ rest_of_handle_reorder_blocks (void)
>    FOR_EACH_BB (bb)
>      if (bb->next_bb != EXIT_BLOCK_PTR)
>        bb->aux = bb->next_bb;
> -  cfg_layout_finalize ();
> +  cfg_layout_finalize (true);
>
> -  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes.  */
> -  insert_section_boundary_note ();
>    return 0;
>  }
>
> @@ -2366,7 +2459,7 @@ duplicate_computed_gotos (void)
>      }
>
>  done:
> -  cfg_layout_finalize ();
> +  cfg_layout_finalize (false);
>
>    BITMAP_FREE (candidates);
>    return 0;
> @@ -2511,6 +2604,8 @@ partition_hot_cold_basic_blocks (void)
>    if (crossing_edges == NULL)
>      return 0;
>
> +  crtl->has_bb_partition = true;
> +
>    /* Make sure the source of any crossing edge ends in a jump and the
>       destination of any crossing edge has a label.  */
>    add_labels_and_missing_jumps (crossing_edges);
> Index: bb-reorder.h
> ===================================================================
> --- bb-reorder.h        (revision 193376)
> +++ bb-reorder.h        (working copy)
> @@ -36,4 +36,8 @@ extern struct target_bb_reorder *this_target_bb_re
>
>  extern int get_uncond_jump_length (void);
>
> +extern void insert_section_boundary_note (void);
> +
> +extern void emit_barrier_after_bb (basic_block bb);
> +
>  #endif
> Index: basic-block.h
> ===================================================================
> --- basic-block.h       (revision 193376)
> +++ basic-block.h       (working copy)
> @@ -806,6 +806,7 @@ extern basic_block force_nonfallthru_and_redirect
>  extern bool contains_no_active_insn_p (const_basic_block);
>  extern bool forwarder_block_p (const_basic_block);
>  extern bool can_fallthru (basic_block, basic_block);
> +extern void fixup_partitions (void);
>
>  /* In cfgbuild.c.  */
>  extern void find_many_sub_basic_blocks (sbitmap);
> Index: cfgrtl.c
> ===================================================================
> --- cfgrtl.c    (revision 193376)
> +++ cfgrtl.c    (working copy)
> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree.h"
>  #include "hard-reg-set.h"
>  #include "basic-block.h"
> +#include "bb-reorder.h"
>  #include "regs.h"
>  #include "flags.h"
>  #include "function.h"
> @@ -67,11 +68,12 @@ along with GCC; see the file COPYING3.  If not see
>     Only applicable if the CFG is in cfglayout mode.  */
>  static GTY(()) rtx cfg_layout_function_footer;
>  static GTY(()) rtx cfg_layout_function_header;
> +static bool had_sec_boundary_notes;
>
>  static rtx skip_insns_after_block (basic_block);
>  static void record_effective_endpoints (void);
>  static rtx label_for_bb (basic_block);
> -static void fixup_reorder_chain (void);
> +static void fixup_reorder_chain (bool finalize_reorder_blocks);
>
>  void verify_insn_chain (void);
>  static void fixup_fallthru_exit_predecessor (void);
> @@ -976,8 +978,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc
>       partition boundaries).  See  the comments at the top of
>       bb-reorder.c:partition_hot_cold_basic_blocks for complete details.  */
>
> -  if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
> -      || BB_PARTITION (src) != BB_PARTITION (target))
> +  if (BB_PARTITION (src) != BB_PARTITION (target))
>      return NULL;
>
>    /* We can replace or remove a complex jump only when we have exactly
> @@ -1286,6 +1287,71 @@ redirect_branch_edge (edge e, basic_block target)
>    return e;
>  }
>
> +/* Called when edge E has been redirected to a new destination,
> +   in order to update the region crossing flag on the edge and
> +   jump.  */
> +
> +static void
> +fixup_partition_crossing (edge e, basic_block target)
> +{
> +  rtx note;
> +
> +  gcc_assert (e->dest == target);
> +
> +  if (e->src == ENTRY_BLOCK_PTR || target == EXIT_BLOCK_PTR)
> +    return;
> +  /* If we redirected an existing edge, it may already be marked
> +     crossing, even though the new src is missing a reg crossing note.
> +     But make sure reg crossing note doesn't already exist before
> +     inserting.  */
> +  if (BB_PARTITION (e->src) != BB_PARTITION (target))
> +    {
> +      e->flags |= EDGE_CROSSING;
> +      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> +      if (JUMP_P (BB_END (e->src))
> +          && !note)
> +        add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> +    }
> +  else if (BB_PARTITION (e->src) == BB_PARTITION (target))
> +    {
> +      e->flags &= ~EDGE_CROSSING;
> +      /* Remove the region crossing note from jump at end of
> +         e->src if it exists.  */
> +      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
> +      if (note)
> +        remove_note (BB_END (e->src), note);
> +    }
> +}
> +
> +/* Called when block BB has been reassigned to a different partition,
> +   to ensure that the region crossing attributes are updated.  */
> +
> +static void
> +fixup_bb_partition (basic_block bb)
> +{
> +  edge e;
> +  edge_iterator ei;
> +
> +  /* Now need to make bb's pred edges non-region crossing.  */
> +  FOR_EACH_EDGE (e, ei, bb->preds)
> +    {
> +      fixup_partition_crossing (e, e->dest);
> +    }
> +
> +  /* Possibly need to make bb's successor edges region crossing,
> +     or remove stale region crossing.  */
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> +    {
> +      if ((e->flags & EDGE_FALLTHRU)
> +          && BB_PARTITION (bb) != BB_PARTITION (e->dest)
> +          && e->dest != EXIT_BLOCK_PTR)
> +        /* force_nonfallthru_and_redirect calls fixup_partition_crossing.  */
> +        force_nonfallthru (e);
> +      else
> +        fixup_partition_crossing (e, e->dest);
> +    }
> +}
> +
>  /* Attempt to change code to redirect edge E to TARGET.  Don't do that on
>     expense of adding new instructions or reordering basic blocks.
>
> @@ -1302,16 +1368,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block
>  {
>    edge ret;
>    basic_block src = e->src;
> +  basic_block dest = e->dest;
>
>    if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
>      return NULL;
>
> -  if (e->dest == target)
> +  if (dest == target)
>      return e;
>
>    if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL)
>      {
>        df_set_bb_dirty (src);
> +      fixup_partition_crossing (ret, target);
>        return ret;
>      }
>
> @@ -1320,6 +1388,7 @@ rtl_redirect_edge_and_branch (edge e, basic_block
>      return NULL;
>
>    df_set_bb_dirty (src);
> +  fixup_partition_crossing (ret, target);
>    return ret;
>  }
>
> @@ -1454,18 +1523,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
>        /* Make sure new block ends up in correct hot/cold section.  */
>
>        BB_COPY_PARTITION (jump_block, e->src);
> -      if (flag_reorder_blocks_and_partition
> -         && targetm_common.have_named_sections
> -         && JUMP_P (BB_END (jump_block))
> -         && !any_condjump_p (BB_END (jump_block))
> -         && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING))
> -       add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX);
>
>        /* Wire edge in.  */
>        new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU);
>        new_edge->probability = probability;
>        new_edge->count = count;
>
> +      /* If e->src was previously region crossing, it no longer is
> +         and the reg crossing note should be removed.  */
> +      fixup_partition_crossing (new_edge, jump_block);
> +
>        /* Redirect old edge.  */
>        redirect_edge_pred (e, jump_block);
>        e->probability = REG_BR_PROB_BASE;
> @@ -1521,13 +1588,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
>        LABEL_NUSES (label)++;
>      }
>
> -  emit_barrier_after (BB_END (jump_block));
> +  /* We might be in cfg layout mode, and if so, the following routine will
> +     insert the barrier correctly.  */
> +  emit_barrier_after_bb (jump_block);
>    redirect_edge_succ_nodup (e, target);
>
>    if (abnormal_edge_flags)
>      make_edge (src, target, abnormal_edge_flags);
>
>    df_mark_solutions_dirty ();
> +  fixup_partition_crossing (e, target);
>    return new_bb;
>  }
>
> @@ -1626,7 +1696,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
>  static basic_block
>  rtl_split_edge (edge edge_in)
>  {
> -  basic_block bb;
> +  basic_block bb, new_bb;
>    rtx before;
>
>    /* Abnormal edges cannot be split.  */
> @@ -1659,12 +1729,26 @@ rtl_split_edge (edge edge_in)
>    else
>      {
>        bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
> -      /* ??? Why not edge_in->dest->prev_bb here?  */
> -      BB_COPY_PARTITION (bb, edge_in->dest);
> +      if (edge_in->src == ENTRY_BLOCK_PTR)
> +        BB_COPY_PARTITION (bb, edge_in->dest);
> +      else
> +        /* Put the split bb into the src partition, to avoid creating
> +           a situation where a cold bb dominates a hot bb, in the case
> +           where src is cold and dest is hot. The src will dominate
> +           the new bb (whereas it might not have dominated dest).  */
> +        BB_COPY_PARTITION (bb, edge_in->src);
>      }
>
>    make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU);
>
> +  /* Can't allow a region crossing edge to be fallthrough.  */
> +  if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest)
> +      && edge_in->dest != EXIT_BLOCK_PTR)
> +    {
> +      new_bb = force_nonfallthru (single_succ_edge (bb));
> +      gcc_assert (!new_bb);
> +    }
> +
>    /* For non-fallthru edges, we must adjust the predecessor's
>       jump instruction to target our new block.  */
>    if ((edge_in->flags & EDGE_FALLTHRU) == 0)
> @@ -1777,17 +1861,13 @@ commit_one_edge_insertion (edge e)
>    else
>      {
>        bb = split_edge (e);
> -      after = BB_END (bb);
>
> -      if (flag_reorder_blocks_and_partition
> -         && targetm_common.have_named_sections
> -         && e->src != ENTRY_BLOCK_PTR
> -         && BB_PARTITION (e->src) == BB_COLD_PARTITION
> -         && !(e->flags & EDGE_CROSSING)
> -         && JUMP_P (after)
> -         && !any_condjump_p (after)
> -         && (single_succ_edge (bb)->flags & EDGE_CROSSING))
> -       add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX);
> +      /* If e crossed a partition boundary, we needed to make bb end in
> +         a region-crossing jump, even though it was originally fallthru.  */
> +      if (JUMP_P (BB_END (bb)))
> +       before = BB_END (bb);
> +      else
> +        after = BB_END (bb);
>      }
>
>    /* Now that we've found the spot, do the insertion.  */
> @@ -1827,6 +1907,14 @@ commit_edge_insertions (void)
>  {
>    basic_block bb;
>
> +  /* Optimization passes that invoke this routine can cause hot blocks
> +     previously reached by both hot and cold blocks to become dominated only
> +     by cold blocks. This will cause the verification below to fail,
> +     and lead to now cold code in the hot section. In some cases this
> +     may only be visible after newly unreachable blocks are deleted,
> +     which will be done by fixup_partitions.  */
> +  fixup_partitions ();
> +
>  #ifdef ENABLE_CHECKING
>    verify_flow_info ();
>  #endif
> @@ -2028,7 +2116,75 @@ get_last_bb_insn (basic_block bb)
>
>    return end;
>  }
> -
> +
> +/* Perform cleanup on the hot/cold bb partitioning after optimization
> +   passes that modify the cfg.  */
> +
> +void
> +fixup_partitions (void)
> +{
> +  basic_block bb;
> +
> +  if (!crtl->has_bb_partition)
> +    return;
> +
> +  /* Delete any blocks that became unreachable and weren't
> +     already cleaned up, for example during edge forwarding
> +     and convert_jumps_to_returns. This will expose more
> +     opportunities for fixing the partition boundaries here.
> +     Also, the calculation of the dominance graph during verification
> +     will assert if there are unreachable nodes.  */
> +  delete_unreachable_blocks ();
> +
> +  /* If there are partitions, do a sanity check on them: A basic block in
> +     a cold partition cannot dominate a basic block in a hot partition.
> +     Fixup any that now violate this requirement, as a result of edge
> +     forwarding and unreachable block deletion.  */
> +  VEC (basic_block, heap) *bbs_in_cold_partition = NULL;
> +  VEC (basic_block, heap) *bbs_to_fix = NULL;
> +  FOR_EACH_BB (bb)
> +    if ((BB_PARTITION (bb) == BB_COLD_PARTITION))
> +      VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb);
> +  if (! VEC_empty (basic_block, bbs_in_cold_partition))
> +    {
> +      bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS);
> +      basic_block son;
> +
> +      if (dom_calculated_here)
> +        calculate_dominance_info (CDI_DOMINATORS);
> +
> +      while (! VEC_empty (basic_block, bbs_in_cold_partition))
> +        {
> +          bb = VEC_pop (basic_block, bbs_in_cold_partition);
> +          /* If bb is not yet cold (because it was added below as
> +             a block dominated by a cold bb) then mark it cold here.  */
> +          if ((BB_PARTITION (bb) != BB_COLD_PARTITION))
> +            {
> +              BB_SET_PARTITION (bb, BB_COLD_PARTITION);
> +              VEC_safe_push (basic_block, heap, bbs_to_fix, bb);
> +            }
> +          /* Any blocks dominated by a block in the cold section
> +             must also be cold.  */
> +          for (son = first_dom_son (CDI_DOMINATORS, bb);
> +               son;
> +               son = next_dom_son (CDI_DOMINATORS, son))
> +            VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son);
> +        }
> +
> +      if (dom_calculated_here)
> +        free_dominance_info (CDI_DOMINATORS);
> +    }
> +
> +  /* Do the partition fixup after all necessary blocks have been converted to
> +     cold, so that we only update the region crossings the minimum number of
> +     places, which can require forcing edges to be non fallthru.  */
> +  while (! VEC_empty (basic_block, bbs_to_fix))
> +    {
> +      bb = VEC_pop (basic_block, bbs_to_fix);
> +      fixup_bb_partition (bb);
> +    }
> +}
> +
>  /* Verify the CFG and RTL consistency common for both underlying RTL and
>     cfglayout RTL.
>
> @@ -2052,6 +2208,7 @@ rtl_verify_flow_info_1 (void)
>    rtx x;
>    int err = 0;
>    basic_block bb;
> +  bool have_partitions = false;
>
>    /* Check the general integrity of the basic blocks.  */
>    FOR_EACH_BB_REVERSE (bb)
> @@ -2169,6 +2326,8 @@ rtl_verify_flow_info_1 (void)
>
>           if (e->flags & EDGE_ABNORMAL)
>             n_abnormal++;
> +
> +          have_partitions |= is_crossing;
>         }
>
>        if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX))
> @@ -2293,6 +2452,40 @@ rtl_verify_flow_info_1 (void)
>           }
>      }
>
> +  /* If there are partitions, do a sanity check on them: A basic block in
> +     a cold partition cannot dominate a basic block in a hot partition.  */
> +  VEC (basic_block, heap) *bbs_in_cold_partition = NULL;
> +  if (have_partitions && !err)
> +    FOR_EACH_BB (bb)
> +      if ((BB_PARTITION (bb) == BB_COLD_PARTITION))
> +        VEC_safe_push (basic_block, heap, bbs_in_cold_partition, bb);
> +  if (! VEC_empty (basic_block, bbs_in_cold_partition))
> +    {
> +      bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS);
> +      basic_block son;
> +
> +      if (dom_calculated_here)
> +        calculate_dominance_info (CDI_DOMINATORS);
> +
> +      while (! VEC_empty (basic_block, bbs_in_cold_partition))
> +        {
> +          bb = VEC_pop (basic_block, bbs_in_cold_partition);
> +          if ((BB_PARTITION (bb) != BB_COLD_PARTITION))
> +            {
> +              error ("non-cold basic block %d dominated "
> +                     "by a block in the cold partition", bb->index);
> +              err = 1;
> +            }
> +          for (son = first_dom_son (CDI_DOMINATORS, bb);
> +               son;
> +               son = next_dom_son (CDI_DOMINATORS, son))
> +            VEC_safe_push (basic_block, heap, bbs_in_cold_partition, son);
> +        }
> +
> +      if (dom_calculated_here)
> +        free_dominance_info (CDI_DOMINATORS);
> +    }
> +
>    /* Clean up.  */
>    return err;
>  }
> @@ -2965,14 +3158,41 @@ record_effective_endpoints (void)
>    else
>      cfg_layout_function_header = NULL_RTX;
>
> +  had_sec_boundary_notes = false;
> +
>    next_insn = get_insns ();
>    FOR_EACH_BB (bb)
>      {
>        rtx end;
>
>        if (PREV_INSN (BB_HEAD (bb)) && next_insn != BB_HEAD (bb))
> -       BB_HEADER (bb) = unlink_insn_chain (next_insn,
> -                                             PREV_INSN (BB_HEAD (bb)));
> +        {
> +          /* Rather than try to keep section boundary notes incrementally
> +             up-to-date through cfg layout optimizations, simply remove them
> +             and flag that they should be re-inserted when exiting
> +             cfg layout mode.  */
> +          rtx check_insn = next_insn;
> +          while (check_insn)
> +            {
> +              if (NOTE_P (check_insn)
> +                  && NOTE_KIND (check_insn) == 
> NOTE_INSN_SWITCH_TEXT_SECTIONS)
> +              {
> +                had_sec_boundary_notes |= true;
> +                /* Remove note from chain. Grab new next_insn first.  */
> +                if (next_insn == check_insn)
> +                  next_insn = NEXT_INSN (check_insn);
> +                /* Delete note.  */
> +                delete_insn (check_insn);
> +                /* There will only be one.  */
> +                break;
> +              }
> +              check_insn = NEXT_INSN (check_insn);
> +            }
> +          /* If we still have header instructions left after above loop.  */
> +          if (next_insn != BB_HEAD (bb))
> +            BB_HEADER (bb) = unlink_insn_chain (next_insn,
> +                                                PREV_INSN (BB_HEAD (bb)));
> +        }
>        end = skip_insns_after_block (bb);
>        if (NEXT_INSN (BB_END (bb)) && BB_END (bb) != end)
>         BB_FOOTER (bb) = unlink_insn_chain (NEXT_INSN (BB_END (bb)), end);
> @@ -3000,7 +3220,7 @@ outof_cfg_layout_mode (void)
>      if (bb->next_bb != EXIT_BLOCK_PTR)
>        bb->aux = bb->next_bb;
>
> -  cfg_layout_finalize ();
> +  cfg_layout_finalize (false);
>
>    return 0;
>  }
> @@ -3120,10 +3340,13 @@ relink_block_chain (bool stay_in_cfglayout_mode)
>  }
>
>
> -/* Given a reorder chain, rearrange the code to match.  */
> +/* Given a reorder chain, rearrange the code to match. If
> +   this is called when we will FINALIZE_REORDER_BLOCKS, or when
> +   section boundary notes were removed on entry to cfg layout
> +   mode, insert section boundary notes here.  */
>
>  static void
> -fixup_reorder_chain (void)
> +fixup_reorder_chain (bool finalize_reorder_blocks)
>  {
>    basic_block bb;
>    rtx insn = NULL;
> @@ -3150,7 +3373,7 @@ static void
>           PREV_INSN (BB_HEADER (bb)) = insn;
>           insn = BB_HEADER (bb);
>           while (NEXT_INSN (insn))
> -           insn = NEXT_INSN (insn);
> +            insn = NEXT_INSN (insn);
>         }
>        if (insn)
>         NEXT_INSN (insn) = BB_HEAD (bb);
> @@ -3175,6 +3398,11 @@ static void
>      insn = NEXT_INSN (insn);
>
>    set_last_insn (insn);
> +
> +  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes.  */
> +  if (had_sec_boundary_notes || finalize_reorder_blocks)
> +    insert_section_boundary_note ();
> +
>  #ifdef ENABLE_CHECKING
>    verify_insn_chain ();
>  #endif
> @@ -3187,7 +3415,7 @@ static void
>        edge e_fall, e_taken, e;
>        rtx bb_end_insn;
>        rtx ret_label = NULL_RTX;
> -      basic_block nb, src_bb;
> +      basic_block nb;
>        edge_iterator ei;
>
>        if (EDGE_COUNT (bb->succs) == 0)
> @@ -3322,7 +3550,6 @@ static void
>        /* We got here if we need to add a new jump insn.
>          Note force_nonfallthru can delete E_FALL and thus we have to
>          save E_FALL->src prior to the call to force_nonfallthru.  */
> -      src_bb = e_fall->src;
>        nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
>        if (nb)
>         {
> @@ -3330,17 +3557,6 @@ static void
>           bb->aux = nb;
>           /* Don't process this new block.  */
>           bb = nb;
> -
> -         /* Make sure new bb is tagged for correct section (same as
> -            fall-thru source, since you cannot fall-thru across
> -            section boundaries).  */
> -         BB_COPY_PARTITION (src_bb, single_pred (bb));
> -         if (flag_reorder_blocks_and_partition
> -             && targetm_common.have_named_sections
> -             && JUMP_P (BB_END (bb))
> -             && !any_condjump_p (BB_END (bb))
> -             && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING))
> -           add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX);
>         }
>      }
>
> @@ -3644,10 +3860,11 @@ duplicate_insn_chain (rtx from, rtx to)
>             case NOTE_INSN_FUNCTION_BEG:
>               /* There is always just single entry to function.  */
>             case NOTE_INSN_BASIC_BLOCK:
> +              /* We should only switch text sections once.  */
> +           case NOTE_INSN_SWITCH_TEXT_SECTIONS:
>               break;
>
>             case NOTE_INSN_EPILOGUE_BEG:
> -           case NOTE_INSN_SWITCH_TEXT_SECTIONS:
>               emit_note_copy (insn);
>               break;
>
> @@ -3759,10 +3976,13 @@ break_superblocks (void)
>  }
>
>  /* Finalize the changes: reorder insn list according to the sequence 
> specified
> -   by aux pointers, enter compensation code, rebuild scope forest.  */
> +   by aux pointers, enter compensation code, rebuild scope forest. If
> +   this is called when we will FINALIZE_REORDER_BLOCKS, indicate that
> +   to fixup_reorder_chain so that it can insert the proper switch text
> +   section notes.  */
>
>  void
> -cfg_layout_finalize (void)
> +cfg_layout_finalize (bool finalize_reorder_blocks)
>  {
>  #ifdef ENABLE_CHECKING
>    verify_flow_info ();
> @@ -3775,7 +3995,7 @@ void
>  #endif
>        )
>      fixup_fallthru_exit_predecessor ();
> -  fixup_reorder_chain ();
> +  fixup_reorder_chain (finalize_reorder_blocks);
>
>    rebuild_jump_labels (get_insns ());
>    delete_dead_jumptables ();
> @@ -4454,8 +4674,7 @@ rtl_can_remove_branch_p (const_edge e)
>    if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
>      return false;
>
> -  if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
> -      || BB_PARTITION (src) != BB_PARTITION (target))
> +  if (BB_PARTITION (src) != BB_PARTITION (target))
>      return false;
>
>    if (!onlyjump_p (insn)
>
> --
> This patch is available for review at http://codereview.appspot.com/6823047



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Reply via email to