On April 9, 2018 4:17:45 PM GMT+02:00, Jan Hubicka <hubi...@ucw.cz> wrote: >Hi, >this patch fixes most offending artifacts across the function partition >bounary with -freorder-blocks-and-partitions which is now on by default >for i386 (no other targets). The problem is that most of cfgcleanup >and >cfgrtl is disabled on crossing edges and thus we produce pretty ugly >code. > >The comment I removed reffers to non-existing comment, but the fixup >of partitioning only looks for crossing jumps. It generates new >patterns for unconditional jumps and ads trampolines for conditional >jumps if taret requests it (i386 doesn't). > >It is thus safe to turn any far jump to near jump (as this patch does) >and based on target we can do more (this patch doesn't) > >For this late of stage4 I think we are fine with enabling edge >forwarding >only. This saves about 0.5% of code segment on cc1 (measured with and >without patch so the code pays back at least) with profiledbootstrap. >I have patches to fix other issues but I think they can wait for next >stage1. In some cases we would get better code with crossjumping >too, but it does not seem critical. > >profiledbootstrapped/regtested x86_64, ltoprofiledbootstrapped x86_64 >and also >profiledbootstrapped on power. I am running firefox build with LTO+FDO >on >x86_64 too and regtesting on power. OK?
OK. Richard. >Honza > > PR rtl/84058 > * cfgcleanup.c (try_forward_edges): Do not give up on crossing > jumps; choose last target that matches the criteria (i.e. > no partition changes for non-crossing jumps). > * cfgrtl.c (cfg_layout_redirect_edge_and_branch): Add basic > support for redirecting crossing jumps to non-crossing. >Index: cfgcleanup.c >=================================================================== >--- cfgcleanup.c (revision 259167) >+++ cfgcleanup.c (working copy) >@@ -394,19 +394,6 @@ > edge_iterator ei; > edge e, *threaded_edges = NULL; > >- /* If we are partitioning hot/cold basic blocks, we don't want to >- mess up unconditional or indirect jumps that cross between hot >- and cold sections. >- >- Basic block partitioning may result in some jumps that appear to >- be optimizable (or blocks that appear to be mergeable), but which >really >- must be left untouched (they are required to make it safely >across >- partition boundaries). See the comments at the top of >- bb-reorder.c:partition_hot_cold_basic_blocks for complete >details. */ >- >- if (JUMP_P (BB_END (b)) && CROSSING_JUMP_P (BB_END (b))) >- return false; >- > for (ei = ei_start (b->succs); (e = ei_safe_edge (ei)); ) > { > basic_block target, first; >@@ -415,6 +402,7 @@ > bool threaded = false; > int nthreaded_edges = 0; > bool may_thread = first_pass || (b->flags & BB_MODIFIED) != 0; >+ bool new_target_threaded = false; > > /* Skip complex edges because we don't know how to update them. > >@@ -431,29 +419,12 @@ > counter = NUM_FIXED_BLOCKS; > goto_locus = e->goto_locus; > >- /* If we are partitioning hot/cold basic_blocks, we don't want >to mess >- up jumps that cross between hot/cold sections. >- >- Basic block partitioning may result in some jumps that appear >- to be optimizable (or blocks that appear to be mergeable), but which >- really must be left untouched (they are required to make it safely >- across partition boundaries). See the comments at the top of >- bb-reorder.c:partition_hot_cold_basic_blocks for complete >- details. */ >- >- if (first != EXIT_BLOCK_PTR_FOR_FN (cfun) >- && JUMP_P (BB_END (first)) >- && CROSSING_JUMP_P (BB_END (first))) >- return changed; >- > while (counter < n_basic_blocks_for_fn (cfun)) > { > basic_block new_target = NULL; >- bool new_target_threaded = false; > may_thread |= (target->flags & BB_MODIFIED) != 0; > > if (FORWARDER_BLOCK_P (target) >- && !(single_succ_edge (target)->flags & EDGE_CROSSING) > && single_succ (target) != EXIT_BLOCK_PTR_FOR_FN (cfun)) > { > /* Bypass trivial infinite loops. */ >@@ -543,8 +514,14 @@ > break; > > counter++; >- target = new_target; >- threaded |= new_target_threaded; >+ /* Do not turn non-crossing jump to crossing. Depending on target >+ it may require different instruction pattern. */ >+ if ((e->flags & EDGE_CROSSING) >+ || BB_PARTITION (first) == BB_PARTITION (new_target)) >+ { >+ target = new_target; >+ threaded |= new_target_threaded; >+ } > } > > if (counter >= n_basic_blocks_for_fn (cfun)) >Index: cfgrtl.c >=================================================================== >--- cfgrtl.c (revision 259167) >+++ cfgrtl.c (working copy) >@@ -4361,6 +4361,18 @@ > if (e->dest == dest) > return e; > >+ if (e->flags & EDGE_CROSSING >+ && BB_PARTITION (e->src) == BB_PARTITION (dest) >+ && simplejump_p (BB_END (src))) >+ { >+ if (dump_file) >+ fprintf (dump_file, >+ "Removing crossing jump while redirecting edge form %i to >%i\n", >+ e->src->index, dest->index); >+ delete_insn (BB_END (src)); >+ e->flags |= EDGE_FALLTHRU; >+ } >+ > if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun) > && (ret = try_redirect_by_replacing_jump (e, dest, true))) > { >@@ -4424,8 +4436,9 @@ > else > ret = redirect_branch_edge (e, dest); > >+ fixup_partition_crossing (ret); > /* We don't want simplejumps in the insn stream during cfglayout. */ >- gcc_assert (!simplejump_p (BB_END (src))); >+ gcc_assert (!simplejump_p (BB_END (src)) || CROSSING_JUMP_P (BB_END >(src))); > > df_set_bb_dirty (src); > return ret;