On 06/06/2016 04:19 AM, Jan Hubicka wrote:
Hi,
while looking into the profile mismatches introduced by the backward threading
pass I noticed that its heuristics seem quite simplistic. First, it should be
profile sensitive and disallow duplication when optimizing cold paths. Second,
it should use estimate_num_insns, because the gimple statement count is not a
very realistic estimate of the final code size effect. And third, there seems
to be no reason to disable the pass for functions optimized for size.
I've never been happy with the size estimation code -- it's a series of
hacks to address a couple pathological codesize regressions that were
seen when comparing gcc-6 to prior versions. I won't lose any sleep if
we move to estimate_num_insns and rely more on profile data.
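For readers following along, the change boils down to weighting each statement
by its estimated expansion cost rather than counting it as one. A minimal
sketch of that accounting over a block on the candidate path, assuming the
usual GIMPLE iterator API (illustrative only, not the exact hunk from the
patch; "bb" is a block on the path being evaluated):

/* Illustrative: size estimate for one block using the inliner's size
   weights, compared with the old one-per-statement count.  */
int weighted = 0, raw = 0;
for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
     !gsi_end_p (gsi); gsi_next (&gsi))
  {
    gimple *stmt = gsi_stmt (gsi);
    if (is_gimple_debug (stmt))
      continue;
    raw++;                                                     /* old count */
    weighted += estimate_num_insns (stmt, &eni_size_weights);  /* new cost */
  }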
If we block duplication of more than 1 insn on size-optimized paths, the pass
is still able to make the majority of the threading decisions that are for
free and improve codegen.
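A sketch of that gate using the standard profile predicates; the predicate
choice and the one-insn threshold are illustrative and may not match the
committed patch exactly:

/* Sketch: refuse to duplicate more than one estimated insn when the
   path is being optimized for size or is not known to be hot.  */
if (n_insns > 1
    && (optimize_function_for_size_p (cfun)
        || !maybe_hot_bb_p (cfun, (*path)[0])))
  return false;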
The code size benefit was between 0.5% and 2.7% on the testcases I tried
(tramp3d, GCC modules, xalancbmk and some other stuff around my hd).
Bootstrapped/regtested x86_64-linux, seems sane?
The pass should also avoid calling cleanup_cfg when no threading was done,
and I do not see why it is guarded by expensive_optimizations. What are the
main compile-time complexity limitations?
The pass essentially walks up the use-def links in the CFG. When it
encounters a PHI, we walk up every PHI argument. That's where it gets
expensive. I think there's a better algorithm to do those walks that
doesn't start at the beginning each time, but I haven't had time to
experiment with coding it up.
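Roughly, the shape of the walk is a recursion that forks at every PHI it
meets, which is why the cost can blow up; the names below are illustrative
and are not the actual routines in tree-ssa-threadbackward.c:

/* Illustrative only: each PHI argument restarts a full backward walk,
   so the work multiplies at every PHI met along the way.  */
static void
walk_back (tree name, vec<basic_block> *path)
{
  gimple *def = SSA_NAME_DEF_STMT (name);
  if (gphi *phi = dyn_cast <gphi *> (def))
    {
      for (unsigned i = 0; i < gimple_phi_num_args (phi); i++)
        /* Fork: a separate walk for every incoming argument.  */
        walk_back (gimple_phi_arg_def (phi, i), path);
    }
  else
    {
      /* Single def: keep walking up from here.  */
    }
}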
Honza
* tree-ssa-threadbackward.c: Include tree-inline.h.
(profitable_jump_thread_path): Use estimate_num_insns to estimate
size of copied block; for cold paths reduce duplication.
(find_jump_threads_backwards): Remove redundant tests.
(pass_thread_jumps::gate): Enable for -Os.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Update testcase.
Index: tree-ssa-threadbackward.c
===================================================================
--- tree-ssa-threadbackward.c (revision 237101)
+++ tree-ssa-threadbackward.c (working copy)
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.
#include "tree-pass.h"
#include "gimple-ssa.h"
#include "tree-phinodes.h"
+#include "tree-inline.h"
Probably an indication we want estimate_num_insns broken out into a more
generic place that could be used by the threader, inlining, etc. Please
consider that as a follow-up.
static int max_threaded_paths;
@@ -210,7 +211,7 @@ profitable_jump_thread_path (vec<basic_b
&& !(gimple_code (stmt) == GIMPLE_ASSIGN
&& gimple_assign_rhs_code (stmt) == ASSERT_EXPR)
&& !is_gimple_debug (stmt))
- ++n_insns;
+ n_insns += estimate_num_insns (stmt, &eni_size_weights);
}
/* We do not look at the block with the threaded branch
@@ -238,13 +239,15 @@ profitable_jump_thread_path (vec<basic_b
threaded_through_latch = true;
}
+ gimple *stmt = get_gimple_control_stmt ((*path)[0]);
+ gcc_assert (stmt);
+
/* We are going to remove the control statement at the end of the
last block in the threading path. So don't count it against our
statement count. */
- n_insns--;
- gimple *stmt = get_gimple_control_stmt ((*path)[0]);
- gcc_assert (stmt);
+ n_insns-= estimate_num_insns (stmt, &eni_size_weights);
Whitespace nit here before the "-=".
I think this is fine with the whitespace fix. And again, consider
moving estimate_num_insns to a new location outside the inliner.
jeff