Re: [patch] jump threading multiple paths that start from the same BB

Richard Biener Wed, 04 Jul 2018 02:04:19 -0700

On Wed, Jul 4, 2018 at 10:12 AM Aldy Hernandez <al...@redhat.com> wrote:
>
>
>
> On 07/03/2018 08:16 PM, Jeff Law wrote:
> > On 07/03/2018 03:31 AM, Aldy Hernandez wrote:
> >> On 07/02/2018 07:08 AM, Christophe Lyon wrote:
> >>
> >>>>> On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
> >>>>>> While poking around in the backwards threader I noticed that we bail if
> >>>>>>
> >>>>>> we have already seen a starting BB.
> >>>>>>
> >>>>>>          /* Do not jump-thread twice from the same block.  */
> >>>>>>          if (bitmap_bit_p (threaded_blocks, entry->src->index)
> >>>>>>
> >>>>>> This limitation discards paths that are sub-paths of paths that have
> >>>>>> already been threaded.
> >>>>>>
> >>>>>> The following patch scans the remaining to-be-threaded paths to 
> >>>>>> identify
> >>>>>>
> >>>>>> if any of them start from the same point, and are thus sub-paths of the
> >>>>>>
> >>>>>> just-threaded path.  By removing the common prefix of blocks in 
> >>>>>> upcoming
> >>>>>>
> >>>>>> threadable paths, and then rewiring first non-common block
> >>>>>> appropriately, we expose new threading opportunities, since we are no
> >>>>>> longer starting from the same BB.  We also simplify the would-be
> >>>>>> threaded paths, because we don't duplicate already duplicated paths.
> >> [snip]
> >>> Hi,
> >>>
> >>> I've noticed a regression on aarch64:
> >>> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
> >>> threaded: 3"
> >>> very likely caused by this patch (appeared between 262282 and 262294)
> >>>
> >>> Christophe
> >>
> >> The test needs to be adjusted here.
> >>
> >> The long story is that the aarch64 IL is different at thread3 time in
> >> that it has 2 profitable sub-paths that can now be threaded with my
> >> patch.  This is causing the threaded count to be 5 for aarch64, versus 3
> >> for x86 64.  Previously we couldn't thread these in aarch64, so the
> >> backwards threader would bail.
> >>
> >> One can see the different threading opportunities by sticking
> >> debug_all_paths() at the top of thread_through_all_blocks().  You will
> >> notice that aarch64 has far more candidates to begin with.  The IL on
> >> the x86 backend, has no paths that start on the same BB.  The aarch64,
> >> on the other hand, has many to choose from:
> >>
> >> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11,
> >> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> >> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> >> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> >> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> >> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> >> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> >> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> >> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19,
> >>
> >> Some of these prove unprofitable, but 2 more than before are profitable 
> >> now.
> >>
> >>
> >> BTW, I see another threading related failure on aarch64 which is
> >> unrelated to my patch, and was previously there:
> >>
> >> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps
> >> threaded"
> >>
> >> This is probably another IL incompatibility between architectures.
> >>
> >> Anyways... the attached path fixes the regression.  I have added a note
> >> to the test explaining the IL differences.  We really should rewrite all
> >> the threading tests (I am NOT volunteering ;-)).
> >>
> >> OK for trunk?
> >> Aldy
> >>
> >> curr.patch
> >>
> >>
> >> gcc/testsuite/
> >>
> >>      * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64
> >>      has a slightly different IL that provides more threading
> >>      opportunities.
> > OK.
> >
> > WRT rewriting the tests.  I'd certainly agree that we don't have the
> > right set of knobs to allow us to characterize the target nor do we have
> > the right dumping/scanning facilities to describe and query the CFG changes.
> >
> > The fact that the IL changes so much across targets is a sign that
> > target dependency (probably BRANCH_COST) is twiddling the gimple we
> > generate.  I strongly suspect we'd be a lot better off if we tackled the
> > BRANCH_COST problem first.
>
> Huh.  I've always accepted differing IL between architectures as a
> necessary evil for things like auto-vectorization and the like.
>
> What's the ideal plan here? A knob to set default values for target
> dependent variables that can affect IL layout?  Then we could pass
> -fthis-is-an-IL-test and things be normalized?


Use gimple testcases.  It's currently a bit awkward in some cases
but it worked for me in a few cases.

Richard.

> Thanks.
> Aldy

Re: [patch] jump threading multiple paths that start from the same BB

Reply via email to