On Mon, Jul 8, 2019 at 4:41 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Sun, Jul 7, 2019 at 9:07 PM Jeff Law <l...@redhat.com> wrote:
> >>
> >> On 7/7/19 3:45 AM, Richard Sandiford wrote:
> >> > DCE tries to delete dead stores to local data and also tries to insert
> >> > debug binds for simple cases:
> >> >
> >> >   /* If this is a store into a variable that is being optimized away,
> >> >      add a debug bind stmt if possible.  */
> >> >   if (MAY_HAVE_DEBUG_BIND_STMTS
> >> >       && gimple_assign_single_p (stmt)
> >> >       && is_gimple_val (gimple_assign_rhs1 (stmt)))
> >> >     {
> >> >       tree lhs = gimple_assign_lhs (stmt);
> >> >       if ((VAR_P (lhs) || TREE_CODE (lhs) == PARM_DECL)
> >> >           && !DECL_IGNORED_P (lhs)
> >> >           && is_gimple_reg_type (TREE_TYPE (lhs))
> >> >           && !is_global_var (lhs)
> >> >           && !DECL_HAS_VALUE_EXPR_P (lhs))
> >> >         {
> >> >           tree rhs = gimple_assign_rhs1 (stmt);
> >> >           gdebug *note
> >> >             = gimple_build_debug_bind (lhs, unshare_expr (rhs), stmt);
> >> >           gsi_insert_after (i, note, GSI_SAME_STMT);
> >> >         }
> >> >     }
> >> >
> >> > But this doesn't help for things like "print *ptr" when ptr points
> >> > to the local variable (tests Og-dce-1.c and Og-dce-2.c).  It also tends
> >> > to cause *live* -- and thus useful -- values to be reported as optimised
> >> > out, because we can't yet switch back to tracking the memory location
> >> > as it evolves over time (test Og-dce-3.c).
> >> >
> >> > So for -Og I think it'd be better not to delete any stmts with
> >> > vdefs for now.  This also means that we can avoid the potentially
> >> > expensive vop walks (which already have a cut-off, but still).
> >> >
> >> > The patch also fixes the Og failures in gcc.dg/guality/pr54970.c
> >> > (PR 86638).
> >> >
> >> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >> >
> >> > Richard
> >> >
> >> >
> >> > 2019-07-07  Richard Sandiford  <richard.sandif...@arm.com>
> >> >
> >> > gcc/
> >> >       PR debug/86638
> >> >       * tree-ssa-dce.c (keep_all_vdefs_p): New function.
> >> >       (mark_stmt_if_obviously_necessary): Mark all stmts with vdefs as
> >> >       necessary if keep_all_vdefs_p is true.
> >> >       (mark_aliased_reaching_defs_necessary): Add a gcc_checking_assert
> >> >       that keep_all_vdefs_p is false.
> >> >       (mark_all_reaching_defs_necessary): Likewise.
> >> >       (propagate_necessity): Skip the vuse scan if keep_all_vdefs_p is
> >> >       true.
> >> >
> >> > gcc/testsuite/
> >> >       * c-c++-common/guality/Og-dce-1.c: New test.
> >> >       * c-c++-common/guality/Og-dce-2.c: Likewise.
> >> >       * c-c++-common/guality/Og-dce-3.c: Likewise.
> >> OK
> >
> > I wonder how code size (and compile-time) is affected by the DSE/DCE patch?
> > Say just look at -Og built cc1?
>
> Overall I see a ~2.5% slowdown and a 4.7% increase in load size.
> That comes almost entirely from the (RTL) DSE side; this patch
> and gimple DSE part don't seem to make much difference.
>
> If I keep the gimple passes as-is and just disable RTL DSE, the slowdown
> is still ~2.5% and there's a 4.4% increase in load size.
>
> These are all measuring cc1plus (built from post-patch sources)
> and using -O2 -g tree-into-ssa.ii for the speed checks.
>
> > Can you restrict the keep-all-vdefs to user variables (and measure the
> > difference this makes)?
>
> In order to avoid wrong debug for pointer dereferences, I think it would
> have to be keep-all-vdefs for writes to either user variables or unknown
> locations.  But as above, I can't measure a significant difference with
> the patch.
>
> > Again I wonder if this makes C++ with -Og impractical runtime-wise.
>
> Got a particular test in mind?

Nothing specific - there are a few C/C++ benchmarks in SPEC and there's
also tramp3d-v4.  I guess SRA is much more important for the abstraction
penalty than DSE - FRE should be able to remove the abstraction, so only the
dead stores will remain (but they'd probably execute nicely out-of-order).
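
As a rough illustration of the kind of store under discussion, here is a
minimal C sketch.  The struct and function names are hypothetical and are
not taken from the patch or from the Og-dce tests.  The first store to
p.second is dead as far as program semantics go, but a debugger user
stopped between the two stores might still expect "print *ptr" to show it.

```c
/* Hypothetical example; not from the patch or its testsuite.  */
struct pair { int first; int second; };

int
sum_pair (int a, int b)
{
  struct pair p;            /* Local aggregate, a typical SRA candidate.  */
  struct pair *ptr = &p;    /* A debugger user might "print *ptr".  */

  p.first = a;
  p.second = b;             /* Dead store: overwritten before any read.  */
  p.second = a + b;         /* This is the value that is actually used.  */

  return ptr->first + ptr->second;
}
```

Whether a given compiler version keeps or removes the first store at -Og
depends on the passes enabled, so treat this only as a picture of the
trade-off, not as a test case.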

Anyway, the biggest runtime penalty from -Og is probably not running
any loop optimization (invariant motion mostly).
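
For readers unfamiliar with invariant motion, a small C sketch of what is
being left on the table.  The function names are made up for illustration,
and the second function is a hand-written equivalent, not compiler output.

```c
#include <stddef.h>

/* As written: scale * offset does not change inside the loop, but
   without invariant motion it may be recomputed on every iteration.  */
long
sum_scaled (const int *v, size_t n, int scale, int offset)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i] + (long) scale * offset;
  return total;
}

/* Hand-hoisted form: the invariant product is computed once, before
   the loop, which is roughly what invariant motion would achieve.  */
long
sum_scaled_hoisted (const int *v, size_t n, int scale, int offset)
{
  long k = (long) scale * offset;
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i] + k;
  return total;
}
```

Both functions return the same result; the difference is only in how much
work is done per iteration.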

Richard.

>
> Richard
