On Wed, Oct 31, 2012 at 4:02 PM, Steven Bosscher <stevenb....@gmail.com> wrote:
> On Tue, Oct 30, 2012 at 10:43 PM, Teresa Johnson wrote:
>> Sure, I will give this a try after your verification patch tests
>> complete. Does this mean that the patch you posted above to
>> force_nonfallthru_and_redirect is no longer needed either? I'll see if
>> I can avoid the need for some of my fixes, although I believe at least
>> the function.c one will still be needed. I'll check.
>
> The force_nonfallthru change is still necessary, because
> force_nonfallthru should be almost a no-op in cfglayout mode. The
> whole concept of a fallthru edge doesn't exist in cfglayout mode: any
> single_succ edge is a fallthru edge until the order of the basic
> blocks has been determined and the insn chain is re-linked (cfglayout
> mode originally was developed for bb-reorder, to move blocks around
> more easily). So the correct patch would actually be:
>
> Index: cfgrtl.c
> ===================================================================
> --- cfgrtl.c    (revision 193046)
> +++ cfgrtl.c    (working copy)
> @@ -4547,7 +4547,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = {
>    cfg_layout_split_edge,
>    rtl_make_forwarder_block,
>    NULL, /* tidy_fallthru_edge */
> -  rtl_force_nonfallthru,
> +  NULL, /* force_nonfallthru */
>    rtl_block_ends_with_call_p,
>    rtl_block_ends_with_condjump_p,
>    rtl_flow_call_edges_add,
>
> (Or better yet: Remove the force_nonfallthru and tidy_fallthru_edge
> hooks, they are cfgrtl-only.)
>
> But obviously that won't work because
> bb-reorder.c:fix_up_fall_thru_edges calls this hook while we're in
> cfglayout mode. That is a bug. The call to force_nonfallthru results
> in a "dangling" barrier:
>
> cfgrtl.c:1523  emit_barrier_after (BB_END (jump_block));
>
> In cfglayout mode, barriers don't exist in the insns chain, and they
> don't have to because every edge is a fallthru edge. If there are
> barriers before cfglayout mode, they are either removed or linked in
> the basic block footer, and fixup_reorder_chain restores or inserts
> barriers where necessary to drop out of cfglayout mode. This
> emit_barrier_after call hangs a barrier after BB_END but not in the
> footer, and I'm pretty sure the result will be that the barrier is
> lost in fixup_reorder_chain. See also emit_barrier_after_bb for how
> inserting a barrier should be done in cfglayout mode.
>
> So in short, bbpart doesn't know what it wants to be: a cfgrtl or a
> cfglayout pass. It doesn't work without cfglayout but it's doing
> things that are only correct in the cfgrtl world and Very Wrong Indeed
> in cfglayout-land.
>
>
>> Regarding your earlier question about why we needed to add the
>> barrier, I need to dig up the details again but essentially I found
>> that the barriers were being added by bbpart, but bbro was reordering
>> things and the block that ended up at the border between the hot and
>> cold section didn't necessarily have a barrier on it because it was
>> not previously at the region boundary.
>
> That doesn't sound right. bbpart doesn't actually re-order the basic
> blocks, it only marks the blocks with the partition they will be
> assigned to. Whatever ends up at the border between the two partitions
> is not relevant: the hot section cannot end in a fall-through edge to
> the cold section (verify_flow_info even checks for that, see "fallthru
> edge crosses section boundary (bb %i)") so it must end in some
> explicit jump. Such jumps are always followed by a barrier. The only
> reason I can think of why there might be a missing barrier, is because
> fixup_reorder_chain has a bug and forgets to insert the barrier in
> some cases (and I suspect this may be the case for return patterns, or
> the a.m. issue of a dropper barrier).
>
> I would like to work on debugging this, but it's hard without test cases...

I'm working on trying to reproduce some of these failures in a test
case I can share. So far, I have only been able to reproduce the
failure reported in PR 53743 in spec2006 (456.hmmer/sre_math.c). Still
working on getting a smaller/shareable test case for the other 2
issues.

The failure in PR 53743 (assert in cfg_layout_merge_blocks) is what I
had fixed with my original changes to cfgrtl.c. Need to understand why
there is a reg crossing note between 2 bbs in the same partition.

In the hmmer test case I also hit a failures in rtl_verify_flow_info
and rtl_verify_flow_info_1:

gcc  -c -o sre_math.o -DSPEC_CPU -D
NDEBUG    -fprofile-use -freorder-blocks-and-partition   -O2
     sre_math.c
sre_math.c: In function ‘Gammln’:
sre_math.c:161:1: error: EDGE_CROSSING incorrectly set across same section
 }
 ^
sre_math.c:161:1: error: missing barrier after block 6
sre_math.c:161:1: internal compiler error: verify_flow_info failed


This was due to some code in thread_prologue_and_epilogue_insns that
duplicated tail blocks:

                if (e)
                  {
                    copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
                                                  NULL_RTX, e->src);
                    BB_COPY_PARTITION (copy_bb, e->src);
                  }

In this case e->src (bb 6) was in the cold section and e->dest was in
the hot section, and e->src ended with a REG_CROSSING_JUMP followed by
a barrier. The new copy_bb got put into the cold section by the copy
partition above, leading to the first error. And because the
create_basic_block call inserted the new copy_bb before NEXT_INSN
(BB_END (e->src)), which in this case was the barrier, we ended up
without the barrier after the crossing edge.

I fixed this by making the following change:

--- function.c  (revision 192692)
+++ function.c  (working copy)
@@ -6249,9 +6249,18 @@ thread_prologue_and_epilogue_insns (void)
                    break;
                if (e)
                  {
+                    rtx note;
                    copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
                                                  NULL_RTX, e->src);
                    BB_COPY_PARTITION (copy_bb, e->src);
+                    /* Remove the region crossing note from jump at end of
+                       e->src if it exists.  */
+                    note = find_reg_note (BB_END (e->src),
REG_CROSSING_JUMP, NULL_RTX);
+                    if (note)
+                      /* There would also have been a barrier after
e->src, that
+                         is now after copy_bb, but that shouldn't be a
+                         problem?.  */
+                      remove_note (BB_END (e->src), note);
                  }
                else
                  {

But I am not sure this is really correct in all cases - for example,
what if another hot bb that also didn't require a prologue branched
into the new cloned tail sequence, which is now cold? E.g.
dup_block_and_redirect will redirect all predecessors that don't need
a prologue to the new copy.

I'm going to see if I can get the other 2 failures I had found to
trigger on spec or a smaller test case.

Teresa

>
> Ciao!
> Steven



--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Reply via email to