Re: [Patch] Improving jump-thread pass for PR 54742

Jeff Law Mon, 24 Nov 2014 13:02:00 -0800

On 11/23/14 15:22, Sebastian Pop wrote:

The second patch attached limits the search for FSM jump threads to loops.  With
that patch, we are now down to 470 jump threads in an x86_64-linux bootstrap
(and 424 jump threads on powerpc64-linux bootstrap.)

Yea, that was one of the things I was going to poke at as well, as a quick scan of your patch gave me the impression it wasn't limited to loops.

Again, I haven't looked much at the patch, but I got the impression you're doing a backwards walk through the predecessors to discover the result of the COND_EXPR. Correct?

That's something I'd been wanting to do -- basically start with a COND_EXPR, then walk the dataflow backwards substituting values into the COND_EXPR (possibly creating non-gimple). Ultimately the goal is to substitute and fold, getting to a constant :-)
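To make the idea concrete, here is a minimal sketch of that backwards substitute-and-fold walk on a toy IR. It is not GCC code: `struct block`, `fold_cond_backwards` and the statement kinds are all hypothetical names invented for illustration, each block holds at most one statement, and only a single predecessor path is followed for a condition of the form "var == k".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy IR, not GCC's data structures: each block holds at
   most one statement, either "dst = constant" or "dst = src" (a copy),
   and has a single predecessor along the path being examined.  */
enum stmt_kind { STMT_NONE, STMT_CONST, STMT_COPY };

struct block
{
  enum stmt_kind kind;
  int dst;                    /* variable id written by the statement */
  int src;                    /* variable id read by a copy */
  long value;                 /* constant stored by STMT_CONST */
  const struct block *pred;   /* predecessor on this path, or NULL */
};

enum fold_result { FOLD_UNKNOWN = -1, FOLD_FALSE = 0, FOLD_TRUE = 1 };

/* Walk backwards from BB along the predecessor chain, trying to fold
   the condition "var == k" down to a constant.  */
enum fold_result
fold_cond_backwards (const struct block *bb, int var, long k)
{
  for (; bb != NULL; bb = bb->pred)
    {
      /* Skip blocks that do not define the variable we are tracking.  */
      if (bb->kind == STMT_NONE || bb->dst != var)
        continue;

      /* A constant definition lets the condition fold right here.  */
      if (bb->kind == STMT_CONST)
        return bb->value == k ? FOLD_TRUE : FOLD_FALSE;

      /* Otherwise it is a copy: substitute the source and keep walking.  */
      assert (bb->kind == STMT_COPY);
      var = bb->src;
    }

  /* Ran out of predecessors without reaching a constant.  */
  return FOLD_UNKNOWN;
}
```

On a path where the entry block sets v1 = 7 and a later block copies v0 = v1, calling `fold_cond_backwards` on the last block for "v0 == 7" folds to `FOLD_TRUE`, which is the situation where the branch could be threaded on that path. A real implementation would have to handle PHI nodes, multiple predecessors and arbitrary expressions, none of which this sketch attempts.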

The forward exhaustive stuff we do now is crazy. The backwards approach could be decoupled from DOM & VRP into an independent pass, which I think would be wise.

Using a SEME region copier is also something I really wanted to do long term. In fact, I believe a lot of tree-ssa-threadupdate.c ought to be ripped out and replaced with a SEME based copier.

It appears you've built at least parts of two pieces needed to do all this as a Bodik style optimizer. Which is exactly the long term direction I think this code ought to take.


One of the reasons I think we see more branches is that in sese region copying
we do not use the knowledge of the value of the condition for the last branch in
a jump-thread path: we rely on other propagation passes to remove the branch.
The last attached patch adds:

   /* Remove the last branch in the jump thread path.  */
   remove_ctrl_stmt_and_useless_edges (region_copy[n_region - 1], exit->dest);

That's certainly a possibility. But I would expect that even with this limitation something would be picking up the fact that the branch is statically computable (even if it's an RTL optimizer). But it's definitely something to look for.


Please let me know if the attached patches are producing better results on gcc.


For the trunk:

  instructions: 1339016494968
  branches:      243568982489

First version of your patch:

  instructions: 1339739533291
  branches:      243806615986

Latest version of your patch:

  instructions: 1339749122609
  branches:      243809838262

Which is in the noise for this test. Which makes me wonder if I botched something on the latest run. It doesn't appear so, but I'm re-running just to be sure. I'm also turning on -g so that I can use cg_annotate to poke a bit deeper and perhaps identify one or more concrete examples where your patch is making this worse.
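For a sense of scale, the relative change behind those counts can be computed with a small helper. This is only an illustrative sketch; `pct_change` is a hypothetical name, not part of any tool used for the measurements.

```c
#include <assert.h>

/* Percentage change of PATCHED relative to BASELINE.  */
double
pct_change (long long baseline, long long patched)
{
  assert (baseline != 0);
  return 100.0 * (double) (patched - baseline) / (double) baseline;
}
```

Fed the trunk numbers and the latest-patch numbers above, this gives an increase of roughly 0.05% in instructions and roughly 0.1% in branches.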


Jeff
