Hi Honza,
Thanks very much to get so many useful comments from you.
As a newbie to GCC, not sure whether my questions are described clearly
enough.  Thanks for your patience in advance.  :)


On 2019/6/20 21:47, Jan Hubicka wrote:
Hi,
some comments on the ipa part of the patch
(and thanks for working on it - this was on my TODO list for years)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index de82316d4b1..0d373a67d1b 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -553,6 +553,7 @@ cgraph_node::get_create (tree decl)
        fprintf (dump_file, "Introduced new external node "
                 "(%s) and turned into root of the clone tree.\n",
                 node->dump_name ());
+      node->profile_id = first_clone->profile_id;
      }
    else if (dump_file)
      fprintf (dump_file, "Introduced new external node "

This is independent of the rest of changes.  Do you have example where
this matters? The inline clones are created in ipa-inline while
ipa-profile is run before it, so I can not think of such a scenario.
I see you also copy profile_id from function to clone.  I would like to
know why you needed that.

Also you mention that you hit some ICEs. If fixes are independent of
rest of your changes, send them separately.

I copy the profile_id for cloned node as when in LTO ltrans, there is no
references or referrings info for the specialized node/cloned node, so it is difficult to track the node's reference in cgraph_edge::speculative_call_info. I use it mainly for debug purpose now.
Will remove it and split the patches in later version to include ICE fixes.


@@ -1110,6 +1111,7 @@ cgraph_edge::speculative_call_info (cgraph_edge *&direct,
    int i;
    cgraph_edge *e2;
    cgraph_edge *e = this;
+  cgraph_node *referred_node;
if (!e->indirect_unknown_callee)
      for (e2 = e->caller->indirect_calls;
@@ -1142,8 +1144,20 @@ cgraph_edge::speculative_call_info (cgraph_edge *&direct,
        && ((ref->stmt && ref->stmt == e->call_stmt)
            || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid)))
        {
-       reference = ref;
-       break;
+       if (e2->indirect_info && e2->indirect_info->num_of_ics)
+         {
+           referred_node = dyn_cast<cgraph_node *> (ref->referred);
+           if (strstr (e->callee->name (), referred_node->name ()))
+             {
+               reference = ref;
+               break;
+             }
+         }
+       else
+         {
+           reference = ref;
+           break;
+         }
        }

This function is intended to return everything related to the
speculative call, so if you add multiple direct targets, i would expect
it to tage auto_vec of cgraph_nodes for direct and auto_vec of
references.

So will the signature becomes
cgraph_edge::speculative_call_info (auto_vec<cgraph_edge *> *direct,
                                    cgraph_edge *&indirect,
                                    auto_vec<ipa_ref *> *reference)

Seems a lot of code related to it, maybe should split to another patch.
And will the sequence of direct and reference in each auto_vec be strictly
mapped for iteration convenience?
Second question is "this" is a direct edge will be pushed to auto_vec "direct", how can it get its next direct edge here? From e->caller->callees?


/* Speculative edge always consist of all three components - direct edge,
@@ -1199,7 +1213,14 @@ cgraph_edge::resolve_speculation (tree callee_decl)
           in the functions inlined through it.  */
      }
    edge->count += e2->count;
-  edge->speculative = false;
+  if (edge->indirect_info && edge->indirect_info->num_of_ics)
+    {
+      edge->indirect_info->num_of_ics--;
+      if (edge->indirect_info->num_of_ics == 0)
+       edge->speculative = false;
+    }
+  else
+    edge->speculative = false;
    e2->speculative = false;
    ref->remove_reference ();
    if (e2->indirect_unknown_callee || e2->inline_failed)

This function should turn speculative call into direct call to DECL, so
I think it should remove all the other direct calls associated with stmt
and the indirect one.

There are now two cases - in first case you want to turn speculative
call into direct call or give up on especulation completely, while in
other case you want to only remove one of speculations.

I guess we want to have resolve_speculation(decl) for first and
remove_one_speculation(edge) for the second case?
The second case would be useful for the code below handling type
mismatches and also for inline when one of speculative targets seems not
useful to bother with.

So the logic will be:

if (edge->indirect_info->num_of_ics > 1)
        cgraph_edge::resolve_speculation (tree callee_decl);
else
        remove_one_speculation(edge);

cgraph_edge::resolve_speculation will call edge->speculative_call_info (e2, edge, ref) internally, at this time, e2 and ref will only contains one direct target?


@@ -1333,7 +1354,14 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
          e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
                                                     false);
          e->count = gimple_bb (e->call_stmt)->count;
-         e2->speculative = false;
+         if (e2->indirect_info && e2->indirect_info->num_of_ics)
+           {
+             e2->indirect_info->num_of_ics--;
+             if (e2->indirect_info->num_of_ics == 0)
+               e2->speculative = false;
+           }
+         else
+           e2->speculative = false;
          e2->count = gimple_bb (e2->call_stmt)->count;
          ref->speculative = false;
          ref->stmt = NULL;

  extern void debuginfo_early_init (void);
  extern void debuginfo_init (void);
@@ -1638,11 +1639,17 @@ struct GTY(()) cgraph_indirect_call_info
    int param_index;
    /* ECF flags determined from the caller.  */
    int ecf_flags;
-  /* Profile_id of common target obtrained from profile.  */
+  /* Profile_id of common target obtained from profile.  */
    int common_target_id;
    /* Probability that call will land in function with COMMON_TARGET_ID.  */
    int common_target_probability;
+ /* Profile_id of common target obtained from profile. */
+  int common_target_ids[GCOV_ICALL_TOPN_NCOUNTS / 2];
+  /* Probabilities that call will land in function with COMMON_TARGET_IDS.  */
+  int common_target_probabilities[GCOV_ICALL_TOPN_NCOUNTS / 2];

I would use vec of pairs (profile_id,probability) to hold this and do
not wire in GCOV_ICALL_TOPN_NCOUTS.  Most of time this vec will be just
NULL pointer so it will result in less memory overhead and will avoid
hard limit on number of speculations we want to do.

Note that the speculative edges may end up being redirected during IPA
optimization, for example when their target is cloned for particular
call context or when the function is detected identical to other
function.  So one can not preserve the mapping between targets and
profile ids.

Also this infrastructure is useful even w/o profile because we could use
ipa-devirt to devirtualize even when multiple polymorphic targets are
found. So I would not wirte in the limit GCOV_ICALL_TOPN_NCOUNTS and
just use dynamically allocated vectors instead.

With ipa-devirt it is possible that we know that there are precisely 2
possible polymorphic targets.  Other case i was considering was to
speculatively inline with -fPIC.  I.e. when one has interposiable call
for foo() one can create foo.localalias() for the definition visible to
compiler and then speculate

if (foo == foo.localalias())
   inline_path
else
   foo();

In these cases we may end up with indirect
call that has no associated indirect edge (which we do not support
currently), so it may be interesting to move speculative call info away
from indirect call info to cgraph edge structure (but that can be done
incrementally based on what you do now - this code is not completely
easy so lets do it step by step).

+
+struct GTY (()) indirect_target_info
+{
+  /* Profile_id of common target obtained from profile.  */
+  int common_target_id;
+  /* Probability that call will land in function with COMMON_TARGET_ID.  */
+  int common_target_probability;
+};
+

Tested with "vec<indirect_target_info, va_gc> *indirect_call_targets;"
works quite good, really much better.

Is it correct that profile_id/common_target_id will only be used to generate speculative edges in ipa-profile, all new specialized node/cloned
node will take(not copy) the speculative property from original node, so
was the reason that no need to clone profile id?



@@ -212,6 +216,46 @@ ipa_profile_generate_summary (void)
                      gimple_remove_histogram_value (DECL_STRUCT_FUNCTION 
(node->decl),
                                                      stmt, h);
                    }
+                 else if (h && type == HIST_TYPE_INDIR_CALL_TOPN)
+                   {
+                     unsigned j;
+                     struct cgraph_edge *e = node->get_edge (stmt);
+                     if (e && !e->indirect_unknown_callee)
+                       continue;

I suppose you are going to change this for Martin's implementation, so i
have skipped it for now.
@@ -558,7 +601,8 @@ ipa_profile (void)
        {
          if (n->count.initialized_p ())
            nindirect++;
-         if (e->indirect_info->common_target_id)
+         if (e->indirect_info->common_target_id
+             || (e->indirect_info && e->indirect_info->num_of_ics == 1))
            {
              if (!node_map_initialized)
                init_node_map (false);
@@ -613,7 +657,7 @@ ipa_profile (void)
                      if (dump_file)
                        fprintf (dump_file,
                                 "Not speculating: "
-                                "parameter count mistmatch\n");
+                                "parameter count mismatch\n");
                    }
                  else if (e->indirect_info->polymorphic
                           && !opt_for_fn (n->decl, flag_devirtualize)
@@ -655,7 +699,130 @@ ipa_profile (void)
                  nunknown++;
                }
            }
-        }
+         if (e->indirect_info && e->indirect_info->num_of_ics > 1)
+           {
+             if (in_lto_p)
+               {
+                 if (dump_file)
+                   {
+                     fprintf (dump_file,
+                              "Updating hotness threshold in LTO mode.\n");
+                     fprintf (dump_file, "Updated min count: %" PRId64 "\n",
+                              (int64_t) threshold);
+                   }
+                 set_hot_bb_threshold (threshold
+                                       / e->indirect_info->num_of_ics);
+               }

Why do you need different paths for one target and for multiple targets?
Also why LTO is different here?
Will remove the logic and the above change based on Martin'
implementation.

+             if (!node_map_initialized)
+               init_node_map (false);
+             node_map_initialized = true;
+             ncommon++;
+             unsigned speculative = 0;
+             for (i = 0; i < (int)e->indirect_info->num_of_ics; i++)
+               {
+                 n2 = find_func_by_profile_id (
+                   e->indirect_info->common_target_ids[i]);
+                 if (n2)
+                   {
+                     if (dump_file)
+                       {
+                         fprintf (
+                           dump_file,
+                           "Indirect call -> direct call from"
+                           " other module %s => %s, prob %3.2f\n",
+                           n->dump_name (), n2->dump_name (),
+                           e->indirect_info->common_target_probabilities[i]
+                             / (float) REG_BR_PROB_BASE);
+                       }
+                     if (e->indirect_info->common_target_probabilities[i]
+                         < REG_BR_PROB_BASE / 2)
+                       {
+                         nuseless++;
+                         if (dump_file)
+                           fprintf (
+                             dump_file,
+                             "Not speculating: probability is too low.\n");
+                       }
+                     else if (!e->maybe_hot_p ())
+                       {
+                         nuseless++;
+                         if (dump_file)
+                           fprintf (dump_file,
+                                    "Not speculating: call is cold.\n");
+                       }
+                     else if (n2->get_availability () <= AVAIL_INTERPOSABLE
+                              && n2->can_be_discarded_p ())
+                       {
+                         nuseless++;
+                         if (dump_file)
+                           fprintf (dump_file,
+                                    "Not speculating: target is overwritable "
+                                    "and can be discarded.\n");
+                       }
+                     else if (ipa_node_params_sum && ipa_edge_args_sum
+                              && (!vec_safe_is_empty (
+                                IPA_NODE_REF (n2)->descriptors))
+                              && ipa_get_param_count (IPA_NODE_REF (n2))
+                                   != ipa_get_cs_argument_count (
+                                     IPA_EDGE_REF (e))
+                              && (ipa_get_param_count (IPA_NODE_REF (n2))
+                                    >= ipa_get_cs_argument_count (
+                                      IPA_EDGE_REF (e))
+                                  || !stdarg_p (TREE_TYPE (n2->decl))))
+                       {
+                         nmismatch++;
+                         if (dump_file)
+                           fprintf (dump_file, "Not speculating: "
+                                               "parameter count mismatch\n");
+                       }
+                     else if (e->indirect_info->polymorphic
+                              && !opt_for_fn (n->decl, flag_devirtualize)
+                              && !possible_polymorphic_call_target_p (e, n2))
+                       {
+                         nimpossible++;
+                         if (dump_file)
+                           fprintf (dump_file,
+                                    "Not speculating: "
+                                    "function is not in the polymorphic "
+                                    "call target list\n");
+                       }
+                     else
+                       {
+                         /* Target may be overwritable, but profile says that
+                            control flow goes to this particular implementation
+                            of N2.  Speculate on the local alias to allow
+                            inlining.
+                            */
+                         if (!n2->can_be_discarded_p ())
+                           {
+                             cgraph_node *alias;
+                             alias = dyn_cast<cgraph_node *> (
+                               n2->noninterposable_alias ());
+                             if (alias)
+                               n2 = alias;
+                           }
+                         nconverted++;
+                         e->make_speculative (
+                           n2, e->count.apply_probability (
+                                 e->indirect_info
+                                   ->common_target_probabilities[i]));
+                         update = true;
+                         speculative++;
+                       }
+                   }
+                 else
+                   {
+                     if (dump_file)
+                       fprintf (dump_file,
+                                "Function with profile-id %i not found.\n",
+                                e->indirect_info->common_target_ids[i]);
+                     nunknown++;
+                   }
+               }
+             if (speculative < e->indirect_info->num_of_ics)
+               e->indirect_info->num_of_ics = speculative;
+           }
+       }
         if (update)
         ipa_update_overall_fn_summary (n);
       }
diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index 79b250c3943..30347691029 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -587,6 +587,11 @@ ipa_merge_profiles (struct cgraph_node *dst,
        update_max_bb_count ();
        compute_function_frequency ();
        pop_cfun ();
+      /* When src is speculative, clone the referrings.  */
+      if (src->indirect_call_target)
+       for (e = src->callers; e; e = e->next_caller)
+         if (e->callee == src && e->speculative)
+           dst->clone_referring (src);

This looks wrong. Why do you need to copy all references from src
to target?
Clonning the referrings from callee to the merged callee, not references.
the information from "caller to src" is not passed to "caller to dst" when
merge profiles, it is a workaround when fixing a SPEC ICE.  Will root
cause it later.


+                           /* Speculative calls consist of two edges - direct
+                              and indirect.  Duplicate the whole thing and
+                              distribute frequencies accordingly.  */
+                           if (edge->speculative)
I think here you want to handle the whole group with multiple targets.
OK.  Also an ICE fix in SPEC.


Thanks
Xionghu

+                             {
+                               struct ipa_ref *ref;
+
+                               gcc_assert (!edge->indirect_unknown_callee);
+                               old_edge->speculative_call_info (direct,
+                                                                indirect, ref);
+
+                               profile_count indir_cnt = indirect->count;
+                               indirect
+                                 = indirect->clone (id->dst_node, call_stmt,
+                                                    gimple_uid (stmt), num,
+                                                    den, true);
+
+                               profile_probability prob
+                                 = indir_cnt.probability_in (old_cnt
+                                                             + indir_cnt);
+                               indirect->count
+                                 = copy_basic_block->count.apply_probability (
+                                   prob);
+                               edge->count
+                                 = copy_basic_block->count - indirect->count;
+                               id->dst_node->clone_reference (ref, stmt);
+                             }
+                           else
+                             edge->count = copy_basic_block->count;
+                         }
+                       /* If the indirect call contains more than one indirect
+                          targets, need clone all speculative edges here.  */
+                       if (old_edge && old_edge->next_callee
+                           && old_edge->speculative && indirect
+                           && indirect->indirect_info
+                           && indirect->indirect_info->num_of_ics > 1)
+                         {
+                           edge = old_edge->next_callee;
+                           old_edge = old_edge->next_callee;
+                           if (edge->speculative)
+                             next_speculative = true;
+                         }
+                     }
+                   while (next_speculative);
+                 }
                  break;
case CB_CGE_MOVE_CLONES:

Reply via email to