[ Returning to another old patch... ] On 11/07/2017 10:33 AM, Aldy Hernandez wrote: > [One more time, but without rejected HTML mail, because apparently this > is my first post to gcc-patches *ever* ;-)]. > > Howdy! > > While poking around in the backwards threader I noticed that we bail if > we have already seen a starting BB. > > /* Do not jump-thread twice from the same block. */ > if (bitmap_bit_p (threaded_blocks, entry->src->index) > > This limitation discards paths that are sub-paths of paths that have > already been threaded. > > The following patch scans the remaining to-be-threaded paths to identify > if any of them start from the same point, and are thus sub-paths of the > just-threaded path. By removing the common prefix of blocks in upcoming > threadable paths, and then rewiring first non-common block > appropriately, we expose new threading opportunities, since we are no > longer starting from the same BB. We also simplify the would-be > threaded paths, because we don't duplicate already duplicated paths. > > This sounds confusing, but the documentation for the entry point to my > patch (adjust_paths_after_duplication) shows an actual example: > > +/* After an FSM path has been jump threaded, adjust the remaining FSM > + paths that are subsets of this path, so these paths can be safely > + threaded within the context of the new threaded path. > + > + For example, suppose we have just threaded: > + > + 5 -> 6 -> 7 -> 8 -> 12 => 5 -> 6' -> 7' -> 8' -> 12' > + > + And we have an upcoming threading candidate: > + 5 -> 6 -> 7 -> 8 -> 15 -> 20 > + > + This function adjusts the upcoming path into: > + 8' -> 15 -> 20 > > Notice that we will no longer have two paths that start from the same > BB. One will start with bb5, while the adjusted path will start with > bb8'. With this we kill two birds-- we are able to thread more paths, > and these paths will avoid duplicating a whole mess of things that have > already been threaded. > > The attached patch is a subset of some upcoming work that can live on > its own. It bootstraps and regtests. Also, by running it on a handful > of .ii files, I can see that we are able to thread sub-paths that we > previously dropped on the floor. More is better, right? :) > > To test this, I stole Jeff's method of using cachegrind to benchmark > instruction counts and conditional branches > (https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02434.html). > > Basically, I bootstrapped two compilers, with and without improvements, > and used each to build a stage1 trunk. Each of these stage1-trunks were > used on 246 .ii GCC files I have lying around from a bootstrap some > random point last year. I used no special flags on builds apart from > --enable-languages=c,c++. > > Although I would've wished a larger improvement, this works comes for > free, as it's just a subset of other work I'm hacking on. > > Without further ado, here are my monumental, earth shattering improvements: > > Conditional branches > Without patch: 411846839709 > With patch: 411831330316 > %changed: -0.0037660% > > Number of instructions > Without patch: 2271844650634 > With patch: 2271761153578 > %changed: -0.0036754% > > > OK for trunk? > Aldy > > p.s. There is a lot of debugging/dumping code in my patch, which I can > gladly remove if/when approved. It helps keep my head straight while > looking at this spaghetti :). > > curr.patch > > > gcc/ > > * tree-ssa-threadupdate.c (mark_threaded_blocks): Avoid > dereferencing path[] beyond its length. > (debug_path): New. > (debug_all_paths): New. > (rewire_first_differing_edge): New. > (adjust_paths_after_duplication): New. > (duplicate_thread_path): Call adjust_paths_after_duplication. > Add new argument. > (thread_through_all_blocks): Add new argument to > duplicate_thread_path. This is fine for the trunk. I'd keep the dumping code as-is. It'll be useful in the future :-)
> > diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c > index 1dab0f1fab4..53ac7181b4b 100644 > --- a/gcc/tree-ssa-threadupdate.c > +++ b/gcc/tree-ssa-threadupdate.c > + > +/* Rewire a jump_thread_edge so that the source block is now a > + threaded source block. > + > + PATH_NUM is an index into the global path table PATHS. > + EDGE_NUM is the jump thread edge number into said path. > + > + Returns TRUE if we were able to successfully rewire the edge. */ > + > +static bool > +rewire_first_differing_edge (unsigned path_num, unsigned edge_num) > +{ > + vec<jump_thread_edge *> *path = paths[path_num]; > + edge &e = (*path)[edge_num]->e; > + if (dump_file && (dump_flags & TDF_DETAILS)) > + fprintf (dump_file, "rewiring edge candidate: %d -> %d\n", > + e->src->index, e->dest->index); > + basic_block src_copy = get_bb_copy (e->src); > + if (src_copy == NULL) > + { > + if (dump_file && (dump_flags & TDF_DETAILS)) > + fprintf (dump_file, "ignoring candidate: there is no src COPY\n"); > + return false; > + } > + edge new_edge = find_edge (src_copy, e->dest); > + /* If the previously threaded paths created a flow graph where we > + can no longer figure out where to go, give up. */ > + if (new_edge == NULL) > + { > + if (dump_file && (dump_flags & TDF_DETAILS)) > + fprintf (dump_file, "ignoring candidate: we lost our way\n"); > + return false; > + } > + e = new_edge; > + return true; I was at first a bit confused about how this function had any visible side effects. Then I realized that "e" is actually a reference to an edge in the jump threading path and that seemingly dead assignment at the end isn't really dead :-) Jeff