On Sun, 1 Jul 2018 at 12:56, Aldy Hernandez <al...@redhat.com> wrote:
>
>
>
> On 06/29/2018 02:50 PM, Jeff Law wrote:
> > [ Returning to another old patch... ]
> >
> > On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
> >> [One more time, but without rejected HTML mail, because apparently this
> >> is my first post to gcc-patches *ever* ;-)].
> >>
> >> Howdy!
> >>
> >> While poking around in the backwards threader I noticed that we bail if
> >> we have already seen a starting BB.
> >>
> >>        /* Do not jump-thread twice from the same block.  */
> >>        if (bitmap_bit_p (threaded_blocks, entry->src->index)
> >>
> >> This limitation discards paths that are sub-paths of paths that have
> >> already been threaded.
> >>
> >> The following patch scans the remaining to-be-threaded paths to identify
> >> if any of them start from the same point, and are thus sub-paths of the
> >> just-threaded path.  By removing the common prefix of blocks in upcoming
> >> threadable paths, and then rewiring first non-common block
> >> appropriately, we expose new threading opportunities, since we are no
> >> longer starting from the same BB.  We also simplify the would-be
> >> threaded paths, because we don't duplicate already duplicated paths.
> >>
> >> This sounds confusing, but the documentation for the entry point to my
> >> patch (adjust_paths_after_duplication) shows an actual example:
> >>
> >> +/* After an FSM path has been jump threaded, adjust the remaining FSM
> >> +   paths that are subsets of this path, so these paths can be safely
> >> +   threaded within the context of the new threaded path.
> >> +
> >> +   For example, suppose we have just threaded:
> >> +
> >> +   5 -> 6 -> 7 -> 8 -> 12      =>      5 -> 6' -> 7' -> 8' -> 12'
> >> +
> >> +   And we have an upcoming threading candidate:
> >> +   5 -> 6 -> 7 -> 8 -> 15 -> 20
> >> +
> >> +   This function adjusts the upcoming path into:
> >> +   8' -> 15 -> 20
> >>
> >> Notice that we will no longer have two paths that start from the same
> >> BB.  One will start with bb5, while the adjusted path will start with
> >> bb8'.  With this we kill two birds-- we are able to thread more paths,
> >> and these paths will avoid duplicating a whole mess of things that have
> >> already been threaded.
> >>
> >> The attached patch is a subset of some upcoming work that can live on
> >> its own.  It bootstraps and regtests.  Also, by running it on a handful
> >> of .ii files, I can see that we are able to thread sub-paths that we
> >> previously dropped on the floor.  More is better, right? :)
> >>
> >> To test this, I stole Jeff's method of using cachegrind to benchmark
> >> instruction counts and conditional branches
> >> (https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02434.html).
> >>
> >> Basically, I bootstrapped two compilers, with and without improvements,
> >> and used each to build a stage1 trunk.  Each of these stage1-trunks were
> >> used on 246 .ii GCC files I have lying around from a bootstrap some
> >> random point last year.  I used no special flags on builds apart from
> >> --enable-languages=c,c++.
> >>
> >> Although I would've wished a larger improvement, this works comes for
> >> free, as it's just a subset of other work I'm hacking on.
> >>
> >> Without further ado, here are my monumental, earth shattering improvements:
> >>
> >> Conditional branches
> >>     Without patch: 411846839709
> >>     With    patch: 411831330316
> >>          %changed: -0.0037660%
> >>
> >> Number of instructions
> >>     Without patch: 2271844650634
> >>     With    patch: 2271761153578
> >>          %changed: -0.0036754%
> >>
> >>
> >> OK for trunk?
> >> Aldy
> >>
> >> p.s. There is a lot of debugging/dumping code in my patch, which I can
> >> gladly remove if/when approved.  It helps keep my head straight while
> >> looking at this spaghetti :).
> >>
> >> curr.patch
> >>
> >>
> >> gcc/
> >>
> >>      * tree-ssa-threadupdate.c (mark_threaded_blocks): Avoid
> >>      dereferencing path[] beyond its length.
> >>      (debug_path): New.
> >>      (debug_all_paths): New.
> >>      (rewire_first_differing_edge): New.
> >>      (adjust_paths_after_duplication): New.
> >>      (duplicate_thread_path): Call adjust_paths_after_duplication.
> >>      Add new argument.
> >>      (thread_through_all_blocks): Add new argument to
> >>      duplicate_thread_path.
> > This is fine for the trunk.  I'd keep the dumping code as-is.  It'll be
> > useful in the future :-)
>
> Retested against current trunk and committed.
>

Hi,

I've noticed a regression on aarch64:
FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
threaded: 3"
very likely caused by this patch (appeared between 262282 and 262294)

Christophe

> Thanks.
>
> Aldy

Reply via email to