On Thu, 11 Nov 2021, Jan Hubicka wrote:

> Hi,
> This patch enables optimization of stores that are killed by calls.
> The modref summary is extended by an array containing the list of access
> ranges, relative to function parameters, that are known to be killed by
> the function.  This array is collected during local analysis and
> optimized (so separate stores are glued together).
>
> The kill analysis in ipa-modref.c is quite simplistic.  In particular no
> WPA propagation is done, and we also take a very simple approach to
> proving that a given store is executed on each invocation of the
> function: I simply require it to be in the first basic block and before
> anything that can throw externally.  I have fancier code for that, but
> with this patch I primarily want to discuss the interface to
> tree-ssa-alias.c.  I wonder if there are some helpers I can re-use?
>
> From GCC linktime I get 814 functions with a non-empty kill vector.
>
> Modref stats:
>   modref kill: 39 kills, 7162 queries
>   modref use: 25169 disambiguations, 697722 queries
>   modref clobber: 2290122 disambiguations, 22750147 queries
>   5240008 tbaa queries (0.230329 per modref query)
>   806190 base compares (0.035437 per modref query)
>
> (note that more kills happen at early optimization, where we have not
> inlined that much yet).
>
> For tramp3d (non-lto -O3 build):
>
> Modref stats:
>   modref kill: 45 kills, 630 queries
>   modref use: 750 disambiguations, 10061 queries
>   modref clobber: 35253 disambiguations, 543262 queries
>   85347 tbaa queries (0.157101 per modref query)
>   18727 base compares (0.034471 per modref query)
>
> So it is not that high, but it gets better after improving the analysis
> side, and also with -Os and/or PGO (where we offline cdtors), and after
> wiring in same_addr_size_stores_p, which I want to discuss incrementally.
>
> But at least there are not that many queries to slow down compile times
> noticeably :)
>
> Honza
>
> gcc/ChangeLog:
>
> 	* ipa-modref-tree.h (struct modref_access_node): New member function.
> 	* ipa-modref.c (modref_summary::useful_p): Kills are not useful when
> 	we cannot analyze loads.
> 	(struct modref_summary_lto): Add kills.
> 	(modref_summary::dump): Dump kills.
> 	(record_access): Take access node as parameter.
> 	(record_access_lto): Likewise.
> 	(add_kill): New function.
> 	(merge_call_side_effects): Merge kills.
> 	(analyze_call): Pass around always_executed.
> 	(struct summary_ptrs): Add always_executed flag.
> 	(analyze_load): Update.
> 	(analyze_store): Handle kills.
> 	(analyze_stmt): Pass around always_executed flag; handle kills from
> 	clobbers.
> 	(analyze_function): Compute always_executed.
> 	(modref_summaries::duplicate): Copy kills.
> 	(update_signature): Release kills.
> 	* ipa-modref.h (struct modref_summary): Add kills.
> 	* tree-ssa-alias.c (dump_alias_stats): Dump kills.
> 	(stmt_kills_ref_p): Handle modref kills.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/tree-ssa/modref-dse-2.c: New test.
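To make the "always executed" rule described above concrete, here is a
minimal sketch (illustrative only; the function names are made up):

  struct S { int a, b; };
  extern void may_throw_externally (void);

  /* The store happens in the first basic block with nothing that can
     throw externally before it, so it is eligible for the kill vector;
     a caller's earlier store to p->a becomes dead.  */
  void
  kills_a (struct S *p)
  {
    p->a = 1;
  }

  /* Neither store qualifies: the first is conditional, the second is
     preceded by a call that can throw externally.  */
  void
  kills_nothing (struct S *p, int c)
  {
    if (c)
      p->b = 2;
    may_throw_externally ();
    p->a = 1;
  }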
>
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 17ff6bb582c..6f8caa331a6 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -120,6 +120,8 @@ static struct {
>    unsigned HOST_WIDE_INT modref_use_no_alias;
>    unsigned HOST_WIDE_INT modref_clobber_may_alias;
>    unsigned HOST_WIDE_INT modref_clobber_no_alias;
> +  unsigned HOST_WIDE_INT modref_kill_no;
> +  unsigned HOST_WIDE_INT modref_kill_yes;
>    unsigned HOST_WIDE_INT modref_tests;
>    unsigned HOST_WIDE_INT modref_baseptr_tests;
>  } alias_stats;
> @@ -169,6 +171,12 @@ dump_alias_stats (FILE *s)
> 	   + alias_stats.aliasing_component_refs_p_may_alias);
>    dump_alias_stats_in_alias_c (s);
>    fprintf (s, "\nModref stats:\n");
> +  fprintf (s, "  modref kill: "
> +	   HOST_WIDE_INT_PRINT_DEC" kills, "
> +	   HOST_WIDE_INT_PRINT_DEC" queries\n",
> +	   alias_stats.modref_kill_yes,
> +	   alias_stats.modref_kill_yes
> +	   + alias_stats.modref_kill_no);
>    fprintf (s, "  modref use: "
> 	   HOST_WIDE_INT_PRINT_DEC" disambiguations, "
> 	   HOST_WIDE_INT_PRINT_DEC" queries\n",
> @@ -3373,6 +3381,107 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
>    if (is_gimple_call (stmt))
>      {
>        tree callee = gimple_call_fndecl (stmt);
> +      struct cgraph_node *node;
> +      modref_summary *summary;
> +
> +      /* Try to disambiguate using modref summary.  Modref records a vector
> +	 of stores with known offsets relative to function parameters that
> +	 must happen on every execution of the function.  Find if we have a
> +	 matching store and verify that the function cannot use the value.  */
> +      if (callee != NULL_TREE
> +	  && (node = cgraph_node::get (callee)) != NULL
> +	  && node->binds_to_current_def_p ()
> +	  && (summary = get_modref_function_summary (node)) != NULL
I wonder why we bother producing summaries for things that do not bind
locally?  Does summary->kills.length () have an upper bound?

> +	  && summary->kills.length ())
> +	{
> +	  tree base = ao_ref_base (ref);
> +	  for (unsigned int i = 0; i < summary->kills.length (); i++)
> +	    {
> +	      modref_access_node &a = summary->kills[i];
> +	      tree op;
> +	      poly_offset_int off1_adj = 0, off2_adj = 0;
> +	      poly_int64 off1, off2;
> +	      tree base_ptr = NULL;
> +	      tree base_decl = NULL;
> +
> +	      if (a.parm_index >= 0)
> +		op = gimple_call_arg (stmt, a.parm_index);
> +	      else if (a.parm_index == MODREF_STATIC_CHAIN_PARM)
> +		op = gimple_call_chain (stmt);
> +	      else
> +		gcc_unreachable ();

I wonder if we can abstract this to a modref_access_node method?

> +
> +	      off2_adj += a.parm_offset * BITS_PER_UNIT;

wasn't there an unknown parm_offset case? ...

> +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> +		continue;
> +	      if (TREE_CODE (base) == MEM_REF)
> +		{
> +		  off1_adj = mem_ref_offset (base) << LOG2_BITS_PER_UNIT;
> +		  if (TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> +		    base_decl = TREE_OPERAND (TREE_OPERAND (base, 0), 0);

'base' will be the decl in this case, apart from when the constant offset
doesn't fit ao_ref.offset, so I think you can spare this special-case and
give up on non-SSA base_ptr.

> +		  else
> +		    base_ptr = TREE_OPERAND (base, 0);
> +		}
> +	      /* Give up on TMRs for now.  */
> +	      else if (TREE_CODE (base) == TARGET_MEM_REF)
> +		break;
> +	      else
> +		base_decl = base;
> +
> +	      gcc_checking_assert (!base_decl || DECL_P (base_decl));
> +	      gcc_checking_assert (!base_ptr
> +				   || TREE_CODE (base_ptr) == SSA_NAME);
> +
> +	      /* OP is a pointer and we have the access range from its
> +		 dereference.  */
> +	      if (TREE_CODE (op) == ADDR_EXPR)
> +		{
> +		  poly_int64 size, offset, max_size;
> +		  bool reverse;
> +		  tree op_base = get_ref_base_and_extent
> +				   (TREE_OPERAND (op, 0), &offset, &size,
> +				    &max_size, &reverse);

I think you want get_addr_base_and_unit_offset here.  All variably indexed
addresses are in separate stmts.  That also means you can eventually work
with just byte sizes/offsets?

> +		  if (!known_size_p (size) || !known_eq (size, max_size))
> +		    continue;
> +		  off2_adj += offset;
> +		  /* &str->foo are not passed as gimple operands directly,
> +		     we would need to look up the def stmt.  */
> +		  gcc_checking_assert (TREE_CODE (op_base) != MEM_REF);
> +		  if (!base_decl
> +		      || compare_base_decls (op_base, base_decl) != 1)
> +		    continue;
> +		}
> +	      else if (!base_ptr || !operand_equal_p (base_ptr, op))
> +		continue;
> +
> +	      if (!(off1_adj + ref->offset).to_shwi (&off1))
> +		continue;
> +	      if (!(off2_adj + a.offset).to_shwi (&off2))
> +		continue;
> +
> +	      if (known_subrange_p (off1, ref->max_size, off2, a.size)
> +		  && dbg_cnt (ipa_mod_ref))
> +		{
> +		  /* For the store to be killed it must not be used
> +		     earlier.  */
> +		  if (ref_maybe_used_by_call_p_1 (as_a <gcall *> (stmt), ref,
> +						  true))

Hmm, so modref says p->x is killed when we have

  foo (struct X *p) { int tem = p->x; p->x = 0; return tem; }

?  Or even

  foo (struct X *p) { bar (); p->x = 0; }

?

Otherwise it looks sensible.

Richard.
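To spell out Richard's question from the caller's side (an illustrative
sketch, not part of the patch): a store in the caller may only be removed
if the callee cannot read the old value before overwriting it, which is
what the ref_maybe_used_by_call_p_1 check above has to guarantee.

  struct X { int x; };

  /* Overwrites p->x without reading it: a genuine kill, so a caller's
     earlier store to p->x is dead.  */
  void
  pure_kill (struct X *p)
  {
    p->x = 0;
  }

  /* Reads p->x before overwriting it: the store alone looks like a
     kill, but removing the caller's earlier store would change the
     value returned here, so the use must veto the kill.  */
  int
  read_then_kill (struct X *p)
  {
    int tem = p->x;
    p->x = 0;
    return tem;
  }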
> +		    break;
> +		  if (dump_file && (dump_flags & TDF_DETAILS))
> +		    {
> +		      fprintf (dump_file,
> +			       "ipa-modref: call stmt ");
> +		      print_gimple_stmt (dump_file, stmt, 0);
> +		      fprintf (dump_file,
> +			       "ipa-modref: call to %s kills ",
> +			       node->dump_name ());
> +		      print_generic_expr (dump_file, ref->base);
> +		    }
> +		  ++alias_stats.modref_kill_yes;
> +		  return true;
> +		}
> +	    }
> +	  ++alias_stats.modref_kill_no;
> +	}
>        if (callee != NULL_TREE
> 	  && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
> 	switch (DECL_FUNCTION_CODE (callee))
> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
> index 3e213b23d79..dc3b910801d 100644
> --- a/gcc/ipa-modref-tree.h
> +++ b/gcc/ipa-modref-tree.h
> @@ -287,6 +287,55 @@ struct GTY(()) modref_access_node
> 	      size, max_size, record_adjustments);
>        return true;
>      }
> +  /* Merge in access A if it is possible to do so without losing
> +     precision.  Return true if successful.
> +     Unlike merge, assume that both accesses are always executed
> +     and merge size the same way as max_size.  */
> +  bool merge_for_kills (const modref_access_node &a)
> +  {
> +    poly_int64 offset1 = 0;
> +    poly_int64 aoffset1 = 0;
> +    poly_int64 new_parm_offset = 0;
> +
> +    /* We assume that containment was tested earlier.  */
> +    gcc_checking_assert (!contains (a) && !a.contains (*this));
> +    /* For kills we always need to know the parameter.  */
> +    gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
> +			 && a.parm_index != MODREF_UNKNOWN_PARM);
> +    if (parm_index != a.parm_index)
> +      return false;
> +    /* Unknown offsets are useless.  */
> +    gcc_checking_assert (parm_offset_known && a.parm_offset_known);
> +    if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
> +      return false;
> +    /* Ranges must be known and sizes must match max sizes.  */
> +    gcc_checking_assert (range_info_useful_p ()
> +			 && a.range_info_useful_p ()
> +			 && known_size_p (size) && known_eq (size, max_size)
> +			 && known_size_p (a.size)
> +			 && known_eq (a.size, a.max_size));
> +    if (known_le (offset1, aoffset1))
> +      {
> +	if (!known_size_p (max_size)
> +	    || known_ge (offset1 + max_size, aoffset1))
> +	  {
> +	    update_for_kills (new_parm_offset, offset1, max_size,
> +			      aoffset1, a.max_size);
> +	    return true;
> +	  }
> +      }
> +    else if (known_le (aoffset1, offset1))
> +      {
> +	if (!known_size_p (a.max_size)
> +	    || known_ge (aoffset1 + a.max_size, offset1))
> +	  {
> +	    update_for_kills (new_parm_offset, offset1, max_size,
> +			      aoffset1, a.max_size);
> +	    return true;
> +	  }
> +      }
> +    return false;
> +  }
>    /* Return true if A1 and B1 can be merged with lower information
>       less than A2 and B2.
>       Assume that no containment or lossless merging is possible.  */
> @@ -430,6 +479,33 @@ private:
>      update (parm_offset1, offset1,
> 	     new_size, new_max_size, record_adjustments);
>    }
> +  /* Merge two ranges both starting at parm_offset1 and update THIS
> +     with the result.  */
> +  void update_for_kills (poly_int64 parm_offset1,
> +			 poly_int64 offset1,
> +			 poly_int64 max_size1,
> +			 poly_int64 offset2,
> +			 poly_int64 max_size2)
> +  {
> +    if (known_le (offset1, offset2))
> +      ;
> +    else if (known_le (offset2, offset1))
> +      {
> +	std::swap (offset1, offset2);
> +	std::swap (max_size1, max_size2);
> +      }
> +    else
> +      gcc_unreachable ();
> +
> +    poly_int64 new_max_size = max_size2 + offset2 - offset1;
> +    if (known_le (new_max_size, max_size1))
> +      new_max_size = max_size1;
> +
> +    parm_offset = parm_offset1;
> +    offset = offset1;
> +    max_size = new_max_size;
> +    size = new_max_size;
> +  }
>    /* Given access nodes THIS and A, return true if they
>       can be done with common parm_offsets.  In this case
>       return parm offset in new_parm_offset, new_offset
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index f8b7b900527..d9859daa18e 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -334,6 +334,8 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
>      return false;
>    if (loads && !loads->every_base)
>      return true;
> +  else
> +    kills.release ();
>    if (ecf_flags & ECF_PURE)
>      return false;
>    return stores && !stores->every_base;
> @@ -370,6 +372,7 @@ struct GTY(()) modref_summary_lto
>       more verbose and thus more likely to hit the limits.  */
>    modref_records_lto *loads;
>    modref_records_lto *stores;
> +  auto_vec<modref_access_node> GTY((skip)) kills;
>    auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>    eaf_flags_t retslot_flags;
>    eaf_flags_t static_chain_flags;
> @@ -617,6 +620,12 @@ modref_summary::dump (FILE *out)
>        fprintf (out, "  stores:\n");
>        dump_records (stores, out);
>      }
> +  if (kills.length ())
> +    {
> +      fprintf (out, "  kills:\n");
> +      for (unsigned int i = 0; i < kills.length (); i++)
> +	dump_access (&kills[i], out);
> +    }
>    if (writes_errno)
>      fprintf (out, "  Writes errno\n");
>    if (side_effects)
> @@ -750,13 +759,12 @@ get_access (ao_ref *ref)
>  /* Record access into the modref_records data structure.  */
>
>  static void
> -record_access (modref_records *tt, ao_ref *ref)
> +record_access (modref_records *tt, ao_ref *ref, modref_access_node &a)
>  {
>    alias_set_type base_set = !flag_strict_aliasing ? 0
> 			    : ao_ref_base_alias_set (ref);
>    alias_set_type ref_set = !flag_strict_aliasing ? 0
> 			   : (ao_ref_alias_set (ref));
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>        fprintf (dump_file, "   - Recording base_set=%i ref_set=%i ",
> @@ -769,7 +777,7 @@ record_access (modref_records *tt, ao_ref *ref)
>  /* IPA version of record_access_tree.  */
>
>  static void
> -record_access_lto (modref_records_lto *tt, ao_ref *ref)
> +record_access_lto (modref_records_lto *tt, ao_ref *ref, modref_access_node &a)
>  {
>    /* get_alias_set sometimes uses a different type to compute the alias set
>       than TREE_TYPE (base).  Do the same adjustments.  */
> @@ -816,7 +824,6 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref)
> 	  || variably_modified_type_p (ref_type, NULL_TREE)))
> 	ref_type = NULL_TREE;
>      }
> -  modref_access_node a = get_access (ref);
>    if (dump_file)
>      {
>        fprintf (dump_file, "   - Recording base type:");
> @@ -910,6 +917,82 @@ parm_map_for_arg (tree op)
>    return parm_map;
>  }
>
> +/* Insert new kill A into KILLS.  */
> +
> +static void
> +add_kill (vec<modref_access_node> &kills, modref_access_node &a)
> +{
> +  size_t index;
> +  modref_access_node *a2;
> +  bool merge = false;
> +
> +  /* See if we have a corresponding entry already, or whether we can merge
> +     with a neighbouring entry.  */
> +  FOR_EACH_VEC_ELT (kills, index, a2)
> +    {
> +      if (a2->contains (a))
> +	return;
> +      if (a.contains (*a2))
> +	{
> +	  *a2 = a;
> +	  merge = true;
> +	  break;
> +	}
> +      if (a2->merge_for_kills (a))
> +	{
> +	  merge = true;
> +	  break;
> +	}
> +    }
> +  /* If the entry was not found, insert it.  */
> +  if (!merge)
> +    {
> +      if ((int)kills.length () >= param_modref_max_accesses)
> +	{
> +	  if (dump_file)
> +	    fprintf (dump_file,
> +		     "--param param=modref-max-accesses limit reached:");
> +	  return;
> +	}
> +      kills.safe_push (a);
> +      return;
> +    }
> +  /* Extending the range of an entry may make it possible to merge it
> +     with other entries.  */
> +  {
> +    size_t i;
> +
> +    for (i = 0; i < kills.length ();)
> +      if (i != index)
> +	{
> +	  bool found = false, restart = false;
> +	  modref_access_node *a = &kills[i];
> +	  modref_access_node *n = &kills[index];
> +
> +	  if (n->contains (*a))
> +	    found = true;
> +	  if (!found && n->merge_for_kills (*a))
> +	    found = restart = true;
> +	  gcc_checking_assert (found || !a->merge_for_kills (*n));
> +	  if (found)
> +	    {
> +	      kills.unordered_remove (i);
> +	      if (index == kills.length ())
> +		{
> +		  index = i;
> +		  i++;
> +		}
> +	      if (restart)
> +		i = 0;
> +	    }
> +	  else
> +	    i++;
> +	}
> +      else
> +	i++;
> +  }
> +}
> +
>  /* Merge side effects of call STMT to function with CALLEE_SUMMARY
>     into CUR_SUMMARY.  Return true if something changed.
>     If IGNORE_STORES is true, do not merge stores.
> @@ -920,7 +1003,7 @@
>  bool
>  merge_call_side_effects (modref_summary *cur_summary,
> 			 gimple *stmt, modref_summary *callee_summary,
> 			 bool ignore_stores, cgraph_node *callee_node,
> -			 bool record_adjustments)
> +			 bool record_adjustments, bool always_executed)
>  {
>    auto_vec <modref_parm_map, 32> parm_map;
>    modref_parm_map chain_map;
> @@ -988,6 +1071,29 @@ merge_call_side_effects (modref_summary *cur_summary,
> 	  changed = true;
> 	}
>      }
> +  if (always_executed)
> +    {
> +      size_t i;
> +      modref_access_node *a2;
> +      FOR_EACH_VEC_ELT (callee_summary->kills, i, a2)
> +	{
> +	  modref_access_node n = *a2;
> +	  if (n.parm_index >= (int)parm_map.length ())
> +	    continue;
> +	  modref_parm_map &m
> +	    = n.parm_index == MODREF_STATIC_CHAIN_PARM
> +	      ? chain_map
> +	      : parm_map[n.parm_index];
> +	  if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
> +	      || m.parm_index == MODREF_UNKNOWN_PARM
> +	      || m.parm_index == MODREF_RETSLOT_PARM
> +	      || !m.parm_offset_known)
> +	    continue;
> +	  n.parm_index = m.parm_index;
> +	  n.parm_offset += m.parm_offset;
> +	  add_kill (cur_summary->kills, n);
> +	}
> +    }
>    if (!cur_summary->side_effects
>        && callee_summary->side_effects)
>      {
> @@ -1198,7 +1304,8 @@ process_fnspec (modref_summary *cur_summary,
>
>  static bool
>  analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
> -	      gcall *stmt, vec <gimple *> *recursive_calls)
> +	      gcall *stmt, vec <gimple *> *recursive_calls,
> +	      bool always_executed)
>  {
>    /* Check flags on the function call.  In certain cases, analysis can be
>       simplified.  */
> @@ -1284,7 +1391,7 @@ analyze_call (modref_summary *cur_summary, modref_summary_lto *cur_summary_lto,
>      }
>
>    merge_call_side_effects (cur_summary, stmt, callee_summary, ignore_stores,
> -			   callee_node, false);
> +			   callee_node, false, always_executed);
>
>    return true;
>  }
> @@ -1295,6 +1402,7 @@ struct summary_ptrs
>  {
>    struct modref_summary *nolto;
>    struct modref_summary_lto *lto;
> +  bool always_executed;
>  };
>
>  /* Helper for analyze_stmt.  */
> @@ -1329,11 +1437,12 @@ analyze_load (gimple *, tree, tree op, void *data)
>
>    ao_ref r;
>    ao_ref_init (&r, op);
> +  modref_access_node a = get_access (&r);
>
>    if (summary)
> -    record_access (summary->loads, &r);
> +    record_access (summary->loads, &r, a);
>    if (summary_lto)
> -    record_access_lto (summary_lto->loads, &r);
> +    record_access_lto (summary_lto->loads, &r, a);
>    return false;
>  }
>
> @@ -1369,11 +1478,24 @@ analyze_store (gimple *, tree, tree op, void *data)
>
>    ao_ref r;
>    ao_ref_init (&r, op);
> +  modref_access_node a = get_access (&r);
>
>    if (summary)
> -    record_access (summary->stores, &r);
> +    record_access (summary->stores, &r, a);
>    if (summary_lto)
> -    record_access_lto (summary_lto->stores, &r);
> +    record_access_lto (summary_lto->stores, &r, a);
> +  if (summary
> +      && ((summary_ptrs *)data)->always_executed
> +      && a.parm_offset_known == true
> +      && a.parm_index != MODREF_UNKNOWN_PARM
> +      && a.parm_index != MODREF_RETSLOT_PARM
> +      && known_size_p (r.size)
> +      && known_eq (r.max_size, r.size))
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, "   - Recording kill\n");
> +      add_kill (summary->kills, a);
> +    }
>    return false;
>  }
>
> @@ -1382,16 +1504,36 @@ analyze_store (gimple *, tree, tree op, void *data)
>
>  static bool
>  analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
> -	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls)
> +	      gimple *stmt, bool ipa, vec <gimple *> *recursive_calls,
> +	      bool always_executed)
>  {
>    /* In general we cannot ignore clobbers because they are barriers for code
>       motion, however after inlining it is safe to do because local optimization
>       passes do not consider clobbers from other functions.
>       Similar logic is in ipa-pure-const.c.  */
>    if ((ipa || cfun->after_inlining) && gimple_clobber_p (stmt))
> -    return true;
> +    {
> +      if (summary
> +	  && always_executed && record_access_p (gimple_assign_lhs (stmt)))
> +	{
> +	  ao_ref r;
> +	  ao_ref_init (&r, gimple_assign_lhs (stmt));
> +	  modref_access_node a = get_access (&r);
> +	  if (a.parm_offset_known == true
> +	      && a.parm_index != MODREF_UNKNOWN_PARM
> +	      && a.parm_index != MODREF_RETSLOT_PARM
> +	      && known_size_p (r.size)
> +	      && known_eq (r.max_size, r.size))
> +	    {
> +	      if (dump_file)
> +		fprintf (dump_file, "   - Recording kill\n");
> +	      add_kill (summary->kills, a);
> +	    }
> +	}
> +      return true;
> +    }
>
> -  struct summary_ptrs sums = {summary, summary_lto};
> +  struct summary_ptrs sums = {summary, summary_lto, always_executed};
>
>    /* Analyze all loads and stores in STMT.  */
>    walk_stmt_load_store_ops (stmt, &sums,
> 			     analyze_load, analyze_store);
> @@ -1420,7 +1562,8 @@ analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto,
>      case GIMPLE_CALL:
>        if (!ipa || gimple_call_internal_p (stmt))
> 	return analyze_call (summary, summary_lto,
> -			     as_a <gcall *> (stmt), recursive_calls);
> +			     as_a <gcall *> (stmt), recursive_calls,
> +			     always_executed);
>        else
> 	{
> 	  attr_fnspec fnspec = gimple_call_fnspec (as_a <gcall *>(stmt));
> @@ -2729,11 +2872,15 @@ analyze_function (function *f, bool ipa)
>    FOR_EACH_BB_FN (bb, f)
>      {
>        gimple_stmt_iterator si;
> +      bool always_executed
> +	= bb == single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun))->dest;
> +
>        for (si = gsi_start_nondebug_after_labels_bb (bb);
> 	   !gsi_end_p (si); gsi_next_nondebug (&si))
> 	{
> 	  if (!analyze_stmt (summary, summary_lto,
> -			     gsi_stmt (si), ipa, &recursive_calls)
> +			     gsi_stmt (si), ipa, &recursive_calls,
> +			     always_executed)
> 	      || ((!summary || !summary->useful_p (ecf_flags, false))
> 		  && (!summary_lto
> 		      || !summary_lto->useful_p (ecf_flags, false))))
> 	    {
> 	      collapse_stores (summary, summary_lto);
> 	      break;
> 	    }
> +	  if (always_executed
> +	      && stmt_can_throw_external (cfun, gsi_stmt (si)))
> +	    always_executed = false;
> 	}
>      }
>
> @@ -2761,7 +2911,7 @@ analyze_function (function *f, bool ipa)
> 				   ignore_stores_p (current_function_decl,
> 						    gimple_call_flags
> 						      (recursive_calls[i])),
> -				   fnode, !first);
> +				   fnode, !first, false);
> 	  if (!summary->useful_p (ecf_flags, false))
> 	    {
> 	      remove_summary (lto, nolto, ipa);
> @@ -2993,6 +3143,8 @@ modref_summaries::duplicate (cgraph_node *, cgraph_node *dst,
> 			    src_data->loads->max_refs,
> 			    src_data->loads->max_accesses);
>    dst_data->loads->copy_from (src_data->loads);
> +  dst_data->kills.reserve_exact (src_data->kills.length ());
> +  dst_data->kills.splice (src_data->kills);
>    dst_data->writes_errno = src_data->writes_errno;
>    dst_data->side_effects = src_data->side_effects;
>    if (src_data->arg_flags.length ())
> @@ -3640,6 +3792,8 @@ update_signature (struct cgraph_node *node)
>      {
>        r->loads->remap_params (&map);
>        r->stores->remap_params (&map);
> +      /* TODO: Once we do IPA kills analysis, update the table here.  */
> +      r->kills.release ();
>        if (r->arg_flags.length ())
> 	remap_arg_flags (r->arg_flags, info);
>      }
> @@ -3647,6 +3801,8 @@
>      {
>        r_lto->loads->remap_params (&map);
>        r_lto->stores->remap_params (&map);
> +      /* TODO: Once we do IPA kills analysis, update the table here.  */
> +      r_lto->kills.release ();
>        if (r_lto->arg_flags.length ())
> 	remap_arg_flags (r_lto->arg_flags, info);
>      }
> diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
> index 49c99f263a7..b77c1aa7400 100644
> --- a/gcc/ipa-modref.h
> +++ b/gcc/ipa-modref.h
> @@ -30,6 +30,7 @@ struct GTY(()) modref_summary
>    /* Load and stores in function (transitively closed to all callees) */
>    modref_records *loads;
>    modref_records *stores;
> +  auto_vec<modref_access_node> GTY((skip)) kills;
>    auto_vec<eaf_flags_t> GTY((skip)) arg_flags;
>    eaf_flags_t retslot_flags;
>    eaf_flags_t static_chain_flags;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
> new file mode 100644
> index 00000000000..2a23a9e8ccb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-2.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dse1" } */
> +__attribute__ ((noinline))
> +void write (int *a)
> +{
> +  *a=1;
> +  a[1]=2;
> +}
> +int test ()
> +{
> +  int a;
> +  a=2;
> +  write (&a);
> +  return a;
> +}
> +int test2 (int *a)
> +{
> +  *a=2;
> +  write (a);
> +  return *a;
> +}
> +/* { dg-final { scan-tree-dump-times "Deleted dead store" 2 "dse1"} } */

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
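As a footnote on the merging rule in merge_for_kills/update_for_kills
above: two kill ranges on the same parameter are unioned only when they
overlap or touch, since merging across a hole would claim bytes that are
not actually killed.  A standalone model of the same arithmetic with
plain integers (illustrative only; the real code works on poly_int64 and
has to cope with unknown sizes):

  #include <stdbool.h>
  #include <stdio.h>

  /* Merge [*off1, *off1 + *size1) with [off2, off2 + size2) into their
     union iff the two ranges overlap or touch; mirrors the
     known_ge (offset1 + max_size, aoffset1) test above.  */
  static bool
  merge_ranges (long *off1, long *size1, long off2, long size2)
  {
    if (off2 < *off1)		/* Normalize so that *off1 <= off2.  */
      {
	long t = *off1; *off1 = off2; off2 = t;
	t = *size1; *size1 = size2; size2 = t;
      }
    if (*off1 + *size1 < off2)	/* A hole makes the union lossy.  */
      return false;
    long new_size = off2 + size2 - *off1;
    if (new_size < *size1)
      new_size = *size1;
    *size1 = new_size;
    return true;
  }

  int
  main (void)
  {
    long off = 0, size = 4;
    /* [0,4) and [4,12) touch, so they merge to [0,12).  */
    printf ("%d [%ld,+%ld)\n", merge_ranges (&off, &size, 4, 8), off, size);
    /* [0,12) and [16,20) leave a hole, so they stay separate.  */
    printf ("%d\n", merge_ranges (&off, &size, 16, 4));
    return 0;
  }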