Hi,
On Wed, Aug 25, 2021 at 7:58 PM Jan Hubicka <hubi...@ucw.cz> wrote:
> Hi,
> this patch adds logic needed to merge neighbouring accesses in ipa-modref
> summaries.  This helps with analyzing array initializers and similar code.
> It is a bit of work, since it breaks the property that the modref tree
> forms a good lattice for dataflow: the access ranges can be extended
> indefinitely.  For this reason I added a counter tracking the number of
> adjustments and a cap to limit them during the dataflow.  This triggers in:
>
> void
> recurse (char *p, int n)
> {
>   *p = 0;
>   if (n)
>     recurse (p+1,n-1);
> }
>
> where we now work hard enough to determine:
>
> access: Parm 0 param offset:0 offset:0 size:8 max_size:-1 adjusted 8 times
>
> which is correct access info saying that param0 can be accessed from byte 0
> in 8-bit accesses with unknown max_size.
>
> --param modref-max-accesses is now hit 8 times instead of 45 before the
> patch.  We hit --param modref-max-adjustments once for the fft algorithm
> (where the recursion really seems to walk an array), and modref-max-bases
> once for late modref and 9 times for IPA modref (it works on types rather
> than alias sets, so it is more likely to hit the limit).
>
> I would be happy for suggestions on how to simplify the merging logic.  It
> is a bit convoluted, since I need to know whether I am going to adjust the
> range and need to deal with poly_ints and possibly unknown sizes.
>
> Incrementally I will add logic to improve behaviour when the limits are
> hit instead of just giving up on the analysis.
>
> With the patch I get the following cc1plus stats:
>
> Alias oracle query stats:
>   refs_may_alias_p: 83135089 disambiguations, 101581194 queries
>   ref_maybe_used_by_call_p: 590484 disambiguations, 84157326 queries
>   call_may_clobber_ref_p: 345434 disambiguations, 349295 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 39520 queries
>   nonoverlapping_refs_since_match_p: 33266 disambiguations, 66411 must overlaps, 100574 queries
>   aliasing_component_refs_p: 66251 disambiguations, 9920037 queries
>   TBAA oracle: 31033174 disambiguations 93485041 queries
>                14359693 are in alias set 0
>                11930606 queries asked about the same object
>                129 queries asked about the same alias set
>                0 access volatile
>                34218393 are dependent in the DAG
>                1943046 are artificially in conflict with void *
>
> Modref stats:
>   modref use: 26293 disambiguations, 705198 queries
>   modref clobber: 1828340 disambiguations, 21213011 queries
>   4748965 tbaa queries (0.223870 per modref query)
>   711083 base compares (0.033521 per modref query)
>
> PTA query stats:
>   pt_solution_includes: 13119524 disambiguations, 33183481 queries
>   pt_solutions_intersect: 1510541 disambiguations, 15368102 queries
>
> This would suggest quite a large improvement over my last run
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577962.html
> (about 12% on the overall disambiguation count), but I also updated my
> setup, so part of the increase may be accounted for by different libraries.
> The overall size of modref access lists is about halved on cc1plus, which
> looks promising though.  It may be that we less often hit the limit on the
> number of queries done in tree-ssa-alias.
>
> Bootstrapped/regtested x86_64-linux.
> I plan to commit the patch after a bit more testing.
>
> gcc/ChangeLog:
>
>         * doc/invoke.texi: Document --param modref-max-adjustments.
>         * ipa-modref-tree.c (test_insert_search_collapse): Update.
>         (test_merge): Update.
>         * ipa-modref-tree.h (struct modref_access_node): Add adjustments.
>         (modref_access_node::operator==): Fix handling of access ranges.
>         (modref_access_node::contains): Constify parameter; handle also
>         mismatched parm offsets.
>         (modref_access_node::update): New function.
>         (modref_access_node::merge): New function.
>         (unspecified_modref_access_node): Update constructor.
>         (modref_ref_node::insert_access): Add record_adjustments parameter;
>         handle merging.
>         (modref_ref_node::try_merge_with): New private function.
>         (modref_tree::insert): New record_adjustments parameter.
>         (modref_tree::merge): New record_adjustments parameter.
>         (modref_tree::copy_from): Update.
>         * ipa-modref.c (dump_access): Dump adjustments field.
>         (get_access): Update constructor.
>         (record_access): Update call of insert.
>         (record_access_lto): Update call of insert.
>         (merge_call_side_effects): Add record_adjustments parameter.
>         (get_access_for_fnspec): Update.
>         (process_fnspec): Update.
>         (analyze_call): Update.
>         (analyze_function): Update.
>         (read_modref_records): Update.
>         (ipa_merge_modref_summary_after_inlining): Update.
>         (propagate_unknown_call): Update.
>         (modref_propagate_in_scc): Update.
>         * params.opt (-param=modref-max-adjustments=): New.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/ipa/modref-1.c: Update testcase.
>         * gcc.dg/tree-ssa/modref-4.c: Update testcase.
>         * gcc.dg/tree-ssa/modref-8.c: New test.
>

This patch is causing ICEs on arm:

FAIL: g++.dg/torture/pr89303.C   -O1  (internal compiler error)
FAIL: g++.dg/torture/pr89303.C   -O1  (test for excess errors)
Excess errors:
during GIMPLE pass: modref
/gcc/testsuite/g++.dg/torture/pr89303.C:792:1: internal compiler error: in merge, at ipa-modref-tree.h:203
0xdc9b2b modref_access_node::merge(modref_access_node const&, bool)
        /gcc/ipa-modref-tree.h:203
0xdcbbb9 modref_ref_node<int>::try_merge_with(unsigned long)
        /gcc/ipa-modref-tree.h:397
0xdcc4aa modref_ref_node<int>::insert_access(modref_access_node, unsigned long, bool)
        /gcc/ipa-modref-tree.h:366
0xdcc71b modref_tree<int>::insert(int, int, modref_access_node, bool)
        /gcc/ipa-modref-tree.h:597
0xdc1312 record_access
        /gcc/ipa-modref.c:713
0xdc1e34 analyze_store
        /gcc/ipa-modref.c:1245
0xd00f2e walk_stmt_load_store_addr_ops(gimple*, void*, bool (*)(gimple*, tree_node*, tree_node*, void*), bool (*)(gimple*, tree_node*, tree_node*, void*), bool (*)(gimple*, tree_node*, tree_node*, void*))
        /gcc/gimple-walk.c:767
0xdc6f4a analyze_stmt
        /gcc/ipa-modref.c:1269
0xdc6f4a analyze_function
        /gcc/ipa-modref.c:2131
0xdc860d execute
        /gcc/ipa-modref.c:2957

FAIL: 20_util/enable_shared_from_this/89303.cc (test for excess errors)
Excess errors:
during GIMPLE pass: modref
/libstdc++-v3/testsuite/20_util/enable_shared_from_this/89303.cc:39: internal compiler error: in merge, at ipa-modref-tree.h:203
0xdc9b2b modref_access_node::merge(modref_access_node const&, bool)
        /gcc/ipa-modref-tree.h:203
0xdcbbb9 modref_ref_node<int>::try_merge_with(unsigned long)
        /gcc/ipa-modref-tree.h:397
0xdcc4aa modref_ref_node<int>::insert_access(modref_access_node, unsigned long, bool)
        /gcc/ipa-modref-tree.h:366
0xdcc71b modref_tree<int>::insert(int, int, modref_access_node, bool)
        /gcc/ipa-modref-tree.h:597
0xdc1312 record_access
        /gcc/ipa-modref.c:713
0xdc1e34 analyze_store
        /gcc/ipa-modref.c:1245
0xd00f2e walk_stmt_load_store_addr_ops(gimple*, void*, bool (*)(gimple*, tree_node*, tree_node*, void*), bool (*)(gimple*, tree_node*, tree_node*, void*), bool (*)(gimple*, tree_node*, tree_node*, void*))
        /gcc/gimple-walk.c:767
0xdc6f4a analyze_stmt
        /gcc/ipa-modref.c:1269
0xdc6f4a analyze_function
        /gcc/ipa-modref.c:2131
0xdc860d execute
        /gcc/ipa-modref.c:2957

Can you have a look?

thanks,

Christophe

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b8f5d9e1cce..b83bd902cec 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13423,6 +13423,10 @@ Setting to 0 disables the analysis completely.
>  @item modref-max-escape-points
>  Specifies the maximum number of escape points tracked by modref per SSA-name.
>
> +@item modref-max-adjustments
> +Specifies the maximum number of times the access range is enlarged during
> +modref dataflow analysis.
> +
>  @item profile-func-internal-id
>  A parameter to control whether to use function internal id in profile
>  database lookup.  If the value is 0, the compiler uses an id that
> diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
> index 64e57f52147..69395b0113c 100644
> --- a/gcc/ipa-modref-tree.c
> +++ b/gcc/ipa-modref-tree.c
> @@ -41,7 +41,7 @@ test_insert_search_collapse ()
>    ASSERT_FALSE (t->every_base);
>
>    /* Insert into an empty tree.  */
> -  t->insert (1, 2, a);
> +  t->insert (1, 2, a, false);
>    ASSERT_NE (t->bases, NULL);
>    ASSERT_EQ (t->bases->length (), 1);
>    ASSERT_FALSE (t->every_base);
> @@ -59,7 +59,7 @@ test_insert_search_collapse ()
>    ASSERT_EQ (ref_node->ref, 2);
>
>    /* Insert when base exists but ref does not.  */
> -  t->insert (1, 3, a);
> +  t->insert (1, 3, a, false);
>    ASSERT_NE (t->bases, NULL);
>    ASSERT_EQ (t->bases->length (), 1);
>    ASSERT_EQ (t->search (1), base_node);
> @@ -72,7 +72,7 @@ test_insert_search_collapse ()
>
>    /* Insert when base and ref exist, but access is not dominated by nor
>       dominates other accesses.  */
> -  t->insert (1, 2, a);
> +  t->insert (1, 2, a, false);
>    ASSERT_EQ (t->bases->length (), 1);
>    ASSERT_EQ (t->search (1), base_node);
>
> @@ -80,12 +80,12 @@ test_insert_search_collapse ()
>    ASSERT_NE (ref_node, NULL);
>
>    /* Insert when base and ref exist and access is dominated.  */
> -  t->insert (1, 2, a);
> +  t->insert (1, 2, a, false);
>    ASSERT_EQ (t->search (1), base_node);
>    ASSERT_EQ (base_node->search (2), ref_node);
>
>    /* Insert ref to trigger ref list collapse for base 1.  */
> -  t->insert (1, 4, a);
> +  t->insert (1, 4, a, false);
>    ASSERT_EQ (t->search (1), base_node);
>    ASSERT_EQ (base_node->refs, NULL);
>    ASSERT_EQ (base_node->search (2), NULL);
> @@ -93,7 +93,7 @@ test_insert_search_collapse ()
>    ASSERT_TRUE (base_node->every_ref);
>
>    /* Further inserts to collapsed ref list are ignored.  */
> -  t->insert (1, 5, a);
> +  t->insert (1, 5, a, false);
>    ASSERT_EQ (t->search (1), base_node);
>    ASSERT_EQ (base_node->refs, NULL);
>    ASSERT_EQ (base_node->search (2), NULL);
> @@ -101,13 +101,13 @@ test_insert_search_collapse ()
>    ASSERT_TRUE (base_node->every_ref);
>
>    /* Insert base to trigger base list collapse.  */
> -  t->insert (5, 6, a);
> +  t->insert (5, 6, a, false);
>    ASSERT_TRUE (t->every_base);
>    ASSERT_EQ (t->bases, NULL);
>    ASSERT_EQ (t->search (1), NULL);
>
>    /* Further inserts to collapsed base list are ignored.
*/ > - t->insert (7, 8, a); > + t->insert (7, 8, a, false); > ASSERT_TRUE (t->every_base); > ASSERT_EQ (t->bases, NULL); > ASSERT_EQ (t->search (1), NULL); > @@ -123,22 +123,22 @@ test_merge () > modref_access_node a = unspecified_modref_access_node; > > t1 = new modref_tree<alias_set_type>(3, 4, 1); > - t1->insert (1, 1, a); > - t1->insert (1, 2, a); > - t1->insert (1, 3, a); > - t1->insert (2, 1, a); > - t1->insert (3, 1, a); > + t1->insert (1, 1, a, false); > + t1->insert (1, 2, a, false); > + t1->insert (1, 3, a, false); > + t1->insert (2, 1, a, false); > + t1->insert (3, 1, a, false); > > t2 = new modref_tree<alias_set_type>(10, 10, 10); > - t2->insert (1, 2, a); > - t2->insert (1, 3, a); > - t2->insert (1, 4, a); > - t2->insert (3, 2, a); > - t2->insert (3, 3, a); > - t2->insert (3, 4, a); > - t2->insert (3, 5, a); > - > - t1->merge (t2, NULL); > + t2->insert (1, 2, a, false); > + t2->insert (1, 3, a, false); > + t2->insert (1, 4, a, false); > + t2->insert (3, 2, a, false); > + t2->insert (3, 3, a, false); > + t2->insert (3, 4, a, false); > + t2->insert (3, 5, a, false); > + > + t1->merge (t2, NULL, false); > > ASSERT_FALSE (t1->every_base); > ASSERT_NE (t1->bases, NULL); > diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h > index 2e26b75e21f..6f6932f0875 100644 > --- a/gcc/ipa-modref-tree.h > +++ b/gcc/ipa-modref-tree.h > @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. If not see > Again ref is an template to allow LTO streaming. > 3) Access: this level represent info about individual accesses. > Presently > we record whether access is through a dereference of a function > parameter > + and if so we record the access range. > */ > > #ifndef GCC_MODREF_TREE_H > @@ -57,6 +58,9 @@ struct GTY(()) modref_access_node > a function parameter. */ > int parm_index; > bool parm_offset_known; > + /* Number of times interval was extended during dataflow. > + This has to be limited in order to keep dataflow finite. */ > + unsigned char adjustments; > > /* Return true if access node holds no useful info. */ > bool useful_p () const > @@ -84,6 +88,8 @@ struct GTY(()) modref_access_node > && !known_eq (parm_offset, a.parm_offset)) > return false; > } > + if (range_info_useful_p () != a.range_info_useful_p ()) > + return false; > if (range_info_useful_p () > && (!known_eq (a.offset, offset) > || !known_eq (a.size, size) > @@ -92,16 +98,24 @@ struct GTY(()) modref_access_node > return true; > } > /* Return true A is a subaccess. */ > - bool contains (modref_access_node &a) const > + bool contains (const modref_access_node &a) const > { > - if (parm_index != a.parm_index) > - return false; > + poly_int64 aoffset_adj = 0; > if (parm_index >= 0) > { > - if (parm_offset_known > - && (!a.parm_offset_known > - || !known_eq (parm_offset, a.parm_offset))) > + if (parm_index != a.parm_index) > return false; > + if (parm_offset_known) > + { > + if (!a.parm_offset_known) > + return false; > + /* Accesses are never below parm_offset, so look > + for smaller offset. */ > + if (!known_le (parm_offset, a.parm_offset)) > + return false; > + aoffset_adj = (a.parm_offset - parm_offset) > + << LOG2_BITS_PER_UNIT; > + } > } > if (range_info_useful_p ()) > { > @@ -111,20 +125,181 @@ struct GTY(()) modref_access_node > to fit the store, so smaller or unknown sotre is more general > than large store. 
*/ > if (known_size_p (size) > - && !known_le (size, a.size)) > + && (!known_size_p (a.size) > + || !known_le (size, a.size))) > return false; > if (known_size_p (max_size)) > - return known_subrange_p (a.offset, a.max_size, offset, > max_size); > + return known_subrange_p (a.offset + aoffset_adj, > + a.max_size, offset, max_size); > else > - return known_le (offset, a.offset); > + return known_le (offset, a.offset + aoffset_adj); > } > return true; > } > + /* Update access range to new parameters. > + If RECORD_ADJUSTMENTS is true, record number of changes in the access > + and if threshold is exceeded start dropping precision > + so only constantly many updates are possible. This makes dataflow > + to converge. */ > + void update (poly_int64 parm_offset1, > + poly_int64 offset1, poly_int64 size1, poly_int64 max_size1, > + bool record_adjustments) > + { > + if (known_eq (offset, offset1) > + && known_eq (size, size1) > + && known_eq (max_size, max_size1)) > + return; > + if (!record_adjustments > + || (++adjustments) < param_modref_max_adjustments) > + { > + parm_offset = parm_offset1; > + offset = offset1; > + size = size1; > + max_size = max_size1; > + } > + else > + { > + if (dump_file) > + fprintf (dump_file, > + "--param param=modref-max-adjustments limit > reached:"); > + if (!known_eq (parm_offset, parm_offset1)) > + { > + if (dump_file) > + fprintf (dump_file, " parm_offset cleared"); > + parm_offset_known = false; > + } > + if (!known_eq (size, size1)) > + { > + size = -1; > + if (dump_file) > + fprintf (dump_file, " size cleared"); > + } > + if (!known_eq (max_size, max_size1)) > + { > + max_size = -1; > + if (dump_file) > + fprintf (dump_file, " max_size cleared"); > + } > + if (!known_eq (offset, offset1)) > + { > + offset = 0; > + if (dump_file) > + fprintf (dump_file, " offset cleared"); > + } > + if (dump_file) > + fprintf (dump_file, "\n"); > + } > + } > + /* Merge in access A if it is possible to do without losing > + precision. Return true if successful. > + If RECORD_ADJUSTMENTs is true, remember how many interval > + was prolonged and punt when there are too many. */ > + bool merge (const modref_access_node &a, bool record_adjustments) > + { > + poly_int64 aoffset_adj = 0, offset_adj = 0; > + poly_int64 new_parm_offset = parm_offset; > + > + /* We assume that containment was tested earlier. */ > + gcc_checking_assert (!contains (a) && !a.contains (*this)); > + if (parm_index >= 0) > + { > + if (parm_index != a.parm_index) > + return false; > + if (parm_offset_known) > + { > + if (!a.parm_offset_known) > + return false; > + if (known_le (a.parm_offset, parm_offset)) > + { > + offset_adj = (parm_offset - a.parm_offset) > + << LOG2_BITS_PER_UNIT; > + aoffset_adj = 0; > + new_parm_offset = a.parm_offset; > + } > + else if (known_le (parm_offset, a.parm_offset)) > + { > + aoffset_adj = (a.parm_offset - parm_offset) > + << LOG2_BITS_PER_UNIT; > + offset_adj = 0; > + } > + else > + return false; > + } > + } > + /* See if we can merge ranges. */ > + if (range_info_useful_p ()) > + { > + poly_int64 offset1 = offset + offset_adj; > + poly_int64 aoffset1 = a.offset + aoffset_adj; > + > + /* In this case we have containment that should be > + handled earlier. */ > + gcc_checking_assert (a.range_info_useful_p ()); > + > + /* If a.size is less specified than size, merge only > + if intervals are otherwise equivalent. 
*/ > + if (known_size_p (size) > + && (!known_size_p (a.size) || known_lt (a.size, size))) > + { > + if (((known_size_p (max_size) || known_size_p (a.max_size)) > + && !known_eq (max_size, a.max_size)) > + || !known_eq (offset1, aoffset1)) > + return false; > + update (new_parm_offset, offset1, a.size, max_size, > + record_adjustments); > + return true; > + } > + /* If sizes are same, we can extend the interval. */ > + if ((known_size_p (size) || known_size_p (a.size)) > + && !known_eq (size, a.size)) > + return false; > + if (known_le (offset1, aoffset1)) > + { > + if (!known_size_p (max_size)) > + { > + update (new_parm_offset, offset1, size, max_size, > + record_adjustments); > + return true; > + } > + else if (known_ge (offset1 + max_size, aoffset1)) > + { > + poly_int64 new_max_size = max_size; > + if (known_le (max_size, a.max_size + aoffset1 - offset1)) > + new_max_size = a.max_size + aoffset1 - offset1; > + update (new_parm_offset, offset1, size, new_max_size, > + record_adjustments); > + return true; > + } > + } > + else if (known_le (aoffset1, offset1)) > + { > + if (!known_size_p (a.max_size)) > + { > + update (new_parm_offset, aoffset1, size, a.max_size, > + record_adjustments); > + return true; > + } > + else if (known_ge (aoffset1 + a.max_size, offset1)) > + { > + poly_int64 new_max_size = a.max_size; > + if (known_le (a.max_size, max_size + offset1 - aoffset1)) > + new_max_size = max_size + offset1 - aoffset1; > + update (new_parm_offset, aoffset1, size, new_max_size, > + record_adjustments); > + return true; > + } > + } > + return false; > + } > + update (new_parm_offset, offset + offset_adj, > + size, max_size, record_adjustments); > + return true; > + } > }; > > /* Access node specifying no useful info. */ > const modref_access_node unspecified_modref_access_node > - = {0, -1, -1, 0, -1, false}; > + = {0, -1, -1, 0, -1, false, 0}; > > template <typename T> > struct GTY((user)) modref_ref_node > @@ -149,8 +324,10 @@ struct GTY((user)) modref_ref_node > > /* Insert access with OFFSET and SIZE. > Collapse tree if it has more than MAX_ACCESSES entries. > + If RECORD_ADJUSTMENTs is true avoid too many interval extensions. > Return true if record was changed. */ > - bool insert_access (modref_access_node a, size_t max_accesses) > + bool insert_access (modref_access_node a, size_t max_accesses, > + bool record_adjustments) > { > /* If this base->ref pair has no access information, bail out. */ > if (every_access) > @@ -176,7 +353,17 @@ struct GTY((user)) modref_ref_node > return false; > if (a.contains (*a2)) > { > - *a2 = a; > + a.adjustments = 0; > + a2->parm_index = a.parm_index; > + a2->parm_offset_known = a.parm_offset_known; > + a2->update (a.parm_offset, a.offset, a.size, a.max_size, > + record_adjustments); > + try_merge_with (i); > + return true; > + } > + if (a2->merge (a, record_adjustments)) > + { > + try_merge_with (i); > return true; > } > gcc_checking_assert (!(a == *a2)); > @@ -192,9 +379,28 @@ struct GTY((user)) modref_ref_node > collapse (); > return true; > } > + a.adjustments = 0; > vec_safe_push (accesses, a); > return true; > } > +private: > + /* Try to optimize the access list after entry INDEX was modified. 
*/ > + void > + try_merge_with (size_t index) > + { > + modref_access_node *a2; > + size_t i; > + > + FOR_EACH_VEC_SAFE_ELT (accesses, i, a2) > + if (i != index) > + if ((*accesses)[index].contains (*a2) > + || (*accesses)[index].merge (*a2, false)) > + { > + if (index == accesses->length () - 1) > + index = i; > + accesses->unordered_remove (i); > + } > + } > }; > > /* Base of an access. */ > @@ -342,7 +548,8 @@ struct GTY((user)) modref_tree > > /* Insert memory access to the tree. > Return true if something changed. */ > - bool insert (T base, T ref, modref_access_node a) > + bool insert (T base, T ref, modref_access_node a, > + bool record_adjustments) > { > if (every_base) > return false; > @@ -387,7 +594,8 @@ struct GTY((user)) modref_tree > { > if (ref_node->every_access) > return changed; > - changed |= ref_node->insert_access (a, max_accesses); > + changed |= ref_node->insert_access (a, max_accesses, > + record_adjustments); > /* See if we failed to add useful access. */ > if (ref_node->every_access) > { > @@ -456,7 +664,8 @@ struct GTY((user)) modref_tree > PARM_MAP, if non-NULL, maps parm indexes of callee to caller. -2 is > used > to signalize that parameter is local and does not need to be tracked. > Return true if something has changed. */ > - bool merge (modref_tree <T> *other, vec <modref_parm_map> *parm_map) > + bool merge (modref_tree <T> *other, vec <modref_parm_map> *parm_map, > + bool record_accesses) > { > if (!other || every_base) > return false; > @@ -501,7 +710,8 @@ struct GTY((user)) modref_tree > { > changed |= insert (base_node->base, > ref_node->ref, > - unspecified_modref_access_node); > + unspecified_modref_access_node, > + record_accesses); > } > else > FOR_EACH_VEC_SAFE_ELT (ref_node->accesses, k, access_node) > @@ -525,7 +735,8 @@ struct GTY((user)) modref_tree > = (*parm_map) [a.parm_index].parm_index; > } > } > - changed |= insert (base_node->base, ref_node->ref, a); > + changed |= insert (base_node->base, ref_node->ref, a, > + record_accesses); > } > } > } > @@ -537,7 +748,7 @@ struct GTY((user)) modref_tree > /* Copy OTHER to THIS. */ > void copy_from (modref_tree <T> *other) > { > - merge (other, NULL); > + merge (other, NULL, false); > } > > /* Search BASE in tree; return NULL if failed. */ > diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > index 6ab687a7ba0..0d5ab9c0561 100644 > --- a/gcc/ipa-modref.c > +++ b/gcc/ipa-modref.c > @@ -426,6 +426,8 @@ dump_access (modref_access_node *a, FILE *out) > print_dec ((poly_int64_pod)a->size, out, SIGNED); > fprintf (out, " max_size:"); > print_dec ((poly_int64_pod)a->max_size, out, SIGNED); > + if (a->adjustments) > + fprintf (out, " adjusted %i times", a->adjustments); > } > fprintf (out, "\n"); > } > @@ -656,7 +658,7 @@ get_access (ao_ref *ref) > > base = ao_ref_base (ref); > modref_access_node a = {ref->offset, ref->size, ref->max_size, > - 0, -1, false}; > + 0, -1, false, 0}; > if (TREE_CODE (base) == MEM_REF || TREE_CODE (base) == TARGET_MEM_REF) > { > tree memref = base; > @@ -708,7 +710,7 @@ record_access (modref_records *tt, ao_ref *ref) > fprintf (dump_file, " - Recording base_set=%i ref_set=%i > parm=%i\n", > base_set, ref_set, a.parm_index); > } > - tt->insert (base_set, ref_set, a); > + tt->insert (base_set, ref_set, a, false); > } > > /* IPA version of record_access_tree. 
*/ > @@ -774,7 +776,7 @@ record_access_lto (modref_records_lto *tt, ao_ref *ref) > a.parm_index); > } > > - tt->insert (base_type, ref_type, a); > + tt->insert (base_type, ref_type, a, false); > } > > /* Returns true if and only if we should store the access to EXPR. > @@ -858,12 +860,15 @@ parm_map_for_arg (gimple *stmt, int i) > > /* Merge side effects of call STMT to function with CALLEE_SUMMARY > int CUR_SUMMARY. Return true if something changed. > - If IGNORE_STORES is true, do not merge stores. */ > + If IGNORE_STORES is true, do not merge stores. > + If RECORD_ADJUSTMENTS is true cap number of adjustments to > + a given access to make dataflow finite. */ > > bool > merge_call_side_effects (modref_summary *cur_summary, > gimple *stmt, modref_summary *callee_summary, > - bool ignore_stores, cgraph_node *callee_node) > + bool ignore_stores, cgraph_node *callee_node, > + bool record_adjustments) > { > auto_vec <modref_parm_map, 32> parm_map; > bool changed = false; > @@ -902,11 +907,13 @@ merge_call_side_effects (modref_summary *cur_summary, > fprintf (dump_file, "\n"); > > /* Merge with callee's summary. */ > - changed |= cur_summary->loads->merge (callee_summary->loads, &parm_map); > + changed |= cur_summary->loads->merge (callee_summary->loads, &parm_map, > + record_adjustments); > if (!ignore_stores) > { > changed |= cur_summary->stores->merge (callee_summary->stores, > - &parm_map); > + &parm_map, > + record_adjustments); > if (!cur_summary->writes_errno > && callee_summary->writes_errno) > { > @@ -941,7 +948,7 @@ get_access_for_fnspec (gcall *call, attr_fnspec > &fnspec, > } > modref_access_node a = {0, -1, -1, > map.parm_offset, map.parm_index, > - map.parm_offset_known}; > + map.parm_offset_known, 0}; > poly_int64 size_hwi; > if (size > && poly_int_tree_p (size, &size_hwi) > @@ -1044,12 +1051,14 @@ process_fnspec (modref_summary *cur_summary, > cur_summary->loads->insert (0, 0, > get_access_for_fnspec (call, > fnspec, i, > - map)); > + map), > + false); > if (cur_summary_lto) > cur_summary_lto->loads->insert (0, 0, > get_access_for_fnspec (call, > > fnspec, i, > - map)); > + map), > + false); > } > } > if (ignore_stores) > @@ -1077,12 +1086,14 @@ process_fnspec (modref_summary *cur_summary, > cur_summary->stores->insert (0, 0, > get_access_for_fnspec (call, > fnspec, > i, > - map)); > + map), > + false); > if (cur_summary_lto) > cur_summary_lto->stores->insert (0, 0, > get_access_for_fnspec (call, > > fnspec, i, > - > map)); > + map), > + false); > } > if (fnspec.errno_maybe_written_p () && flag_errno_math) > { > @@ -1168,7 +1179,7 @@ analyze_call (modref_summary *cur_summary, > modref_summary_lto *cur_summary_lto, > } > > merge_call_side_effects (cur_summary, stmt, callee_summary, > ignore_stores, > - callee_node); > + callee_node, false); > > return true; > } > @@ -2134,6 +2145,7 @@ analyze_function (function *f, bool ipa) > if (!ipa) > { > bool changed = true; > + bool first = true; > while (changed) > { > changed = false; > @@ -2144,13 +2156,14 @@ analyze_function (function *f, bool ipa) > ignore_stores_p (current_function_decl, > gimple_call_flags > (recursive_calls[i])), > - fnode); > + fnode, !first); > if (!summary->useful_p (ecf_flags, false)) > { > remove_summary (lto, nolto, ipa); > return; > } > } > + first = false; > } > } > if (summary && !summary->useful_p (ecf_flags)) > @@ -2501,11 +2514,11 @@ read_modref_records (lto_input_block *ib, struct > data_in *data_in, > } > } > modref_access_node a = {offset, size, max_size, parm_offset, > - parm_index, 
parm_offset_known}; > + parm_index, parm_offset_known, > false}; > if (nolto_ref_node) > - nolto_ref_node->insert_access (a, max_accesses); > + nolto_ref_node->insert_access (a, max_accesses, false); > if (lto_ref_node) > - lto_ref_node->insert_access (a, max_accesses); > + lto_ref_node->insert_access (a, max_accesses, false); > } > } > } > @@ -3187,16 +3200,18 @@ ipa_merge_modref_summary_after_inlining > (cgraph_edge *edge) > if (!ignore_stores) > { > if (to_info && callee_info) > - to_info->stores->merge (callee_info->stores, &parm_map); > + to_info->stores->merge (callee_info->stores, &parm_map, false); > if (to_info_lto && callee_info_lto) > - to_info_lto->stores->merge (callee_info_lto->stores, > &parm_map); > + to_info_lto->stores->merge (callee_info_lto->stores, &parm_map, > + false); > } > if (!(flags & (ECF_CONST | ECF_NOVOPS))) > { > if (to_info && callee_info) > - to_info->loads->merge (callee_info->loads, &parm_map); > + to_info->loads->merge (callee_info->loads, &parm_map, false); > if (to_info_lto && callee_info_lto) > - to_info_lto->loads->merge (callee_info_lto->loads, &parm_map); > + to_info_lto->loads->merge (callee_info_lto->loads, &parm_map, > + false); > } > } > > @@ -3346,7 +3361,7 @@ get_access_for_fnspec (cgraph_edge *e, attr_fnspec > &fnspec, > size = TYPE_SIZE_UNIT (get_parm_type (e->callee->decl, i)); > modref_access_node a = {0, -1, -1, > map.parm_offset, map.parm_index, > - map.parm_offset_known}; > + map.parm_offset_known, 0}; > poly_int64 size_hwi; > if (size > && poly_int_tree_p (size, &size_hwi) > @@ -3399,10 +3414,10 @@ propagate_unknown_call (cgraph_node *node, > } > if (cur_summary) > changed |= cur_summary->loads->insert > - (0, 0, get_access_for_fnspec (e, fnspec, i, map)); > + (0, 0, get_access_for_fnspec (e, fnspec, i, map), false); > if (cur_summary_lto) > changed |= cur_summary_lto->loads->insert > - (0, 0, get_access_for_fnspec (e, fnspec, i, map)); > + (0, 0, get_access_for_fnspec (e, fnspec, i, map), false); > } > } > if (ignore_stores_p (node->decl, ecf_flags)) > @@ -3429,10 +3444,10 @@ propagate_unknown_call (cgraph_node *node, > } > if (cur_summary) > changed |= cur_summary->stores->insert > - (0, 0, get_access_for_fnspec (e, fnspec, i, map)); > + (0, 0, get_access_for_fnspec (e, fnspec, i, map), false); > if (cur_summary_lto) > changed |= cur_summary_lto->stores->insert > - (0, 0, get_access_for_fnspec (e, fnspec, i, map)); > + (0, 0, get_access_for_fnspec (e, fnspec, i, map), false); > } > } > if (fnspec.errno_maybe_written_p () && flag_errno_math) > @@ -3491,6 +3506,7 @@ static void > modref_propagate_in_scc (cgraph_node *component_node) > { > bool changed = true; > + bool first = true; > int iteration = 0; > > while (changed) > @@ -3628,11 +3644,12 @@ modref_propagate_in_scc (cgraph_node > *component_node) > if (callee_summary) > { > changed |= cur_summary->loads->merge > - (callee_summary->loads, &parm_map); > + (callee_summary->loads, &parm_map, > !first); > if (!ignore_stores) > { > changed |= cur_summary->stores->merge > - (callee_summary->stores, &parm_map); > + (callee_summary->stores, &parm_map, > + !first); > if (!cur_summary->writes_errno > && callee_summary->writes_errno) > { > @@ -3644,11 +3661,13 @@ modref_propagate_in_scc (cgraph_node > *component_node) > if (callee_summary_lto) > { > changed |= cur_summary_lto->loads->merge > - (callee_summary_lto->loads, &parm_map); > + (callee_summary_lto->loads, &parm_map, > + !first); > if (!ignore_stores) > { > changed |= cur_summary_lto->stores->merge > - (callee_summary_lto->stores, > 
&parm_map); > + (callee_summary_lto->stores, > &parm_map, > + !first); > if (!cur_summary_lto->writes_errno > && callee_summary_lto->writes_errno) > { > @@ -3674,6 +3693,7 @@ modref_propagate_in_scc (cgraph_node *component_node) > } > } > iteration++; > + first = false; > } > if (dump_file) > fprintf (dump_file, > diff --git a/gcc/params.opt b/gcc/params.opt > index f414dc1a61c..223f0a02111 100644 > --- a/gcc/params.opt > +++ b/gcc/params.opt > @@ -1013,6 +1013,10 @@ Maximum depth of DFS walk used by modref escape > analysis. > Common Joined UInteger Var(param_modref_max_escape_points) Init(256) > Param Optimization > Maximum number of escape points tracked by modref per SSA-name. > > +-param=modref-max-adjustments= > +Common Joined UInteger Var(param_modref_max_adjustments) Init(8) > IntegerRange (0, 255) Param Optimization > +Maximum number of times a given range is adjusted during the dataflow > + > -param=tm-max-aggregate-size= > Common Joined UInteger Var(param_tm_max_aggregate_size) Init(9) Param > Optimization > Size in bytes after which thread-local aggregates should be instrumented > with the logging functions instead of save/restore pairs. > diff --git a/gcc/testsuite/gcc.dg/ipa/modref-1.c > b/gcc/testsuite/gcc.dg/ipa/modref-1.c > index 858567d35d5..5314e7dbbf7 100644 > --- a/gcc/testsuite/gcc.dg/ipa/modref-1.c > +++ b/gcc/testsuite/gcc.dg/ipa/modref-1.c > @@ -10,15 +10,15 @@ void a(char *ptr, char *ptr2) > __attribute__((noinline)) > void b(char *ptr) > { > - a(ptr+1,&ptr[2]); > + a(ptr+1,&ptr[3]); > } > > int main() > { > - char c[3]={0,1,0}; > + char c[4]={0,1,0,0}; > b(c); > - return c[0]+c[2]; > + return c[0]+c[3]; > } > /* Check that both param offsets are determined correctly. */ > /* { dg-final { scan-ipa-dump "param offset:1" "modref" } } */ > -/* { dg-final { scan-ipa-dump "param offset:2" "modref" } } */ > +/* { dg-final { scan-ipa-dump "param offset:3" "modref" } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-4.c > b/gcc/testsuite/gcc.dg/tree-ssa/modref-4.c > index 3ac217bafb8..a2b3b1102ec 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/modref-4.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-4.c > @@ -10,7 +10,7 @@ void a(char *ptr, char *ptr2) > __attribute__((noinline)) > void b(char *ptr) > { > - a(ptr+1,&ptr[2]); > + a(ptr+1,&ptr[3]); > } > > int main() > @@ -22,5 +22,5 @@ int main() > /* Check that both param offsets are determined correctly and the > computation > is optimized out. */ > /* { dg-final { scan-tree-dump "param offset:1" "modref1" } } */ > -/* { dg-final { scan-tree-dump "param offset:2" "modref1" } } */ > +/* { dg-final { scan-tree-dump "param offset:3" "modref1" } } */ > /* { dg-final { scan-tree-dump "return 0" "modref1" } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c > b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c > new file mode 100644 > index 00000000000..15ae4acc03f > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-8.c > @@ -0,0 +1,25 @@ > +/* { dg-options "-O2 --param modref-max-adjustments=8 > -fdump-tree-modref1" } */ > +/* { dg-do compile } */ > +void > +set (char *p) > +{ > + p[1]=1; > + p[0]=0; > + p[2]=2; > + p[4]=4; > + p[3]=3; > +} > + > +void > +recurse (char *p, int n) > +{ > + *p = 0; > + if (n) > + recurse (p+1,n-1); > +} > +/* { dg-final { scan-tree-dump-not "param=modref-max-accesses" "modref1" > } } */ > +/* { dg-final { scan-tree-dump "param=modref-max-adjustments" "modref1" } > } */ > +/* In set all accesses should merge together. 
*/ > +/* { dg-final { scan-tree-dump "access: Parm 0 param offset:0 offset:0 > size:8 max_size:40" "modref1" } } */ > +/* In recurse we should cap the recrusion after 8 attempts and set > max_size to -1. */ > +/* { dg-final { scan-tree-dump "access: Parm 0 param offset:0 offset:0 > size:8 max_size:-1 adjusted 8 times" "modref1" } } */ >
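
For readers following the idea described at the top of the thread (extend an
access range when merging neighbouring accesses, but count the adjustments and
drop precision once a cap is hit so the dataflow converges), here is a minimal,
self-contained C++ sketch of that mechanism.  It is not the modref
implementation; access_range, merge_range and the cap value are invented for
illustration, and poly_int handling, parm offsets and the known/unknown size
distinctions of the real patch are deliberately left out.

/* Illustrative sketch only: mimics the convergence cap from the patch,
   not the actual ipa-modref code.  All names are invented.  */
#include <cstdio>

struct access_range
{
  long long offset;	     /* Start of the access, in bits.  */
  long long max_size;	     /* -1 means unknown upper bound.  */
  unsigned char adjustments; /* How many times the range was extended.  */
};

/* Grow RANGE so that it also covers [OFFSET, OFFSET + SIZE).  Allow at most
   MAX_ADJUSTMENTS precise extensions; after that give up on the upper bound
   (max_size = -1) so a chain of merges, e.g. from self-recursion walking an
   array, terminates.  Return true if RANGE changed.  */
static bool
merge_range (access_range &range, long long offset, long long size,
	     unsigned max_adjustments)
{
  long long new_offset = offset < range.offset ? offset : range.offset;
  long long new_end = offset + size;
  if (range.max_size != -1 && range.offset + range.max_size > new_end)
    new_end = range.offset + range.max_size;
  long long new_max_size = range.max_size == -1 ? -1 : new_end - new_offset;

  if (new_offset == range.offset && new_max_size == range.max_size)
    return false;			/* Nothing changed.  */
  if (range.adjustments++ >= max_adjustments)
    new_max_size = -1;			/* Cap reached: drop the bound.  */
  range.offset = new_offset;
  range.max_size = new_max_size;
  return true;
}

int
main ()
{
  access_range r = {0, 8, 0};
  /* Similar in spirit to recurse(): each step touches one more byte.  */
  for (int i = 1; i < 20; i++)
    merge_range (r, i * 8, 8, 8);
  /* Prints: offset:0 max_size:-1 adjusted 9 times.  */
  printf ("offset:%lld max_size:%lld adjusted %d times\n",
	  r.offset, r.max_size, r.adjustments);
  return 0;
}

The real patch plays the same trick with the adjustments field and
--param modref-max-adjustments: each recorded access can only change a bounded
number of times before its range is widened to "unknown", so the summary
dataflow is guaranteed to reach a fixed point.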