On Wed, 20 May 2020, Jan Hubicka wrote:

> Hi,
> > 
> > The patch looks reasonable - the wins are not entirely clear to me
> > since you mix in LTO streaming format optimizations.  You promised
> > to improve collisions on the SCC merging side but those are not
> > spectacular (if present at all) in the above data?
> 
> Thanks for looking into that. Collision ratios was imporving on Firefox
> that has more anonymous types and their derivations, but I was currently
> seeing an ICE building it.  I will debug that.
> 
> I think, quite surprisingly, most stream size benefits come from not
> streaming hash codes for integer_csts and indentifier_nodes.
> These are common and hash codes are large + hard to compress.
> 
> I was thinking about trying making them artifically smaller (32bit only)
> once we nail down the unecessary scc chains.  I will look into that now
> - last time i did stats it was mostly anonymous types and that is why I
> started to worry about them first.
> > 
> > The patch itself looks really nice.  The only comment I have is
> > with respect to the new local_tree_p predicate and its relation
> > to tree_is_indexable.  We should be able to assert that
> > local trees are not indexable and indexable trees never are local.
> 
> This is not true.  Anonymous namespace types are indexable (since they
> are shared across function bodies) but they are local to translation
> unit. Similarly non-public decls.
> 
> > Does this make the predicates equal? ;)  It would be nice if
> > they would cross-reference each other (maybe even assert at least
> > in one way), maybe put next to each other and share parts?
> > We can do this as followup.
> 
> If you are OK doing more stuff as folowup, I would be happy.  I think it
> is good base to fix other issues (track down longer chain in merging;
> sanify scc streaming more etc).
> 
> I just tried to avoid snowballing, since the patch already mixes few
> things together.

Yeah, let's do everything else as followup - the patch as-is looks
good to me.

Thanks,
Richard.

> Thanks,
> Honza
> > 
> > Thanks,
> > Richard.
> > 
> > > gcc/ChangeLog:
> > > 
> > > 2020-05-19  Jan Hubicka  <hubi...@ucw.cz>
> > > 
> > >   * lto-streamer-in.c (lto_input_scc): Add SHARED_SCC parameter.
> > >   (lto_input_tree_1): Strenghten sanity check.
> > >   (lto_input_tree): Update call of lto_input_scc.
> > >   * lto-streamer-out.c: Include ipa-utils.h
> > >   (create_output_block): Initialize local_trees if merigng is going
> > >   to happen.
> > >   (destroy_output_block): Destroy local_trees.
> > >   (DFS): Add max_local_entry.
> > >   (local_tree_p): New function.
> > >   (DFS::DFS): Initialize and maintain it.
> > >   (DFS::DFS_write_tree): Decide on streaming format.
> > >   (lto_output_tree): Stream inline singleton SCCs
> > >   * lto-streamer.h (enum LTO_tags): Add LTO_trees.
> > >   (struct output_block): Add local_trees.
> > >   (lto_input_scc): Update prototype.
> > > 
> > > gcc/lto/ChangeLog:
> > > 
> > > 2020-05-19  Jan Hubicka  <hubi...@ucw.cz>
> > > 
> > >   * lto-common.c (compare_tree_sccs_1): Sanity check that we never
> > >   read TRANSLATION_UNIT_DECL.
> > >   (process_dref): Break out from ...
> > >   (unify_scc): ... here.
> > >   (process_new_tree): Break out from ...
> > >   (lto_read_decls): ... here; handle streaming of singleton trees.
> > >   (print_lto_report_1): Update statistics.
> > > 
> > > diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
> > > index 244f5b8aa5c..6d571c4344a 100644
> > > --- a/gcc/lto-streamer-in.c
> > > +++ b/gcc/lto-streamer-in.c
> > > @@ -1424,16 +1424,17 @@ lto_read_tree (class lto_input_block *ib, class 
> > > data_in *data_in,
> > >  
> > >  
> > >  /* Populate the reader cache with trees materialized from the SCC
> > > -   following in the IB, DATA_IN stream.  */
> > > +   following in the IB, DATA_IN stream.
> > > +   If SHARED_SCC is true we input LTO_tree_scc.  */
> > >  
> > >  hashval_t
> > >  lto_input_scc (class lto_input_block *ib, class data_in *data_in,
> > > -        unsigned *len, unsigned *entry_len)
> > > +        unsigned *len, unsigned *entry_len, bool shared_scc)
> > >  {
> > >    /* A blob of unnamed tree nodes, fill the cache from it and
> > >       recurse.  */
> > >    unsigned size = streamer_read_uhwi (ib);
> > > -  hashval_t scc_hash = streamer_read_uhwi (ib);
> > > +  hashval_t scc_hash = shared_scc ? streamer_read_uhwi (ib) : 0;
> > >    unsigned scc_entry_len = 1;
> > >  
> > >    if (size == 1)
> > > @@ -1456,7 +1457,8 @@ lto_input_scc (class lto_input_block *ib, class 
> > > data_in *data_in,
> > >         || (tag >= LTO_field_decl_ref && tag <= LTO_global_decl_ref)
> > >         || tag == LTO_tree_pickle_reference
> > >         || tag == LTO_integer_cst
> > > -       || tag == LTO_tree_scc)
> > > +       || tag == LTO_tree_scc
> > > +       || tag == LTO_trees)
> > >       gcc_unreachable ();
> > >  
> > >     result = streamer_alloc_tree (ib, data_in, tag);
> > > @@ -1522,7 +1524,7 @@ lto_input_tree_1 (class lto_input_block *ib, class 
> > > data_in *data_in,
> > >                            (a, len, TYPE_PRECISION (type)));
> > >        streamer_tree_cache_append (data_in->reader_cache, result, hash);
> > >      }
> > > -  else if (tag == LTO_tree_scc)
> > > +  else if (tag == LTO_tree_scc || tag == LTO_trees)
> > >      gcc_unreachable ();
> > >    else
> > >      {
> > > @@ -1538,11 +1540,11 @@ lto_input_tree (class lto_input_block *ib, class 
> > > data_in *data_in)
> > >  {
> > >    enum LTO_tags tag;
> > >  
> > > -  /* Input and skip SCCs.  */
> > > -  while ((tag = streamer_read_record_start (ib)) == LTO_tree_scc)
> > > +  /* Input pickled trees needed to stream in the refrence.  */
> > > +  while ((tag = streamer_read_record_start (ib)) == LTO_trees)
> > >      {
> > >        unsigned len, entry_len;
> > > -      lto_input_scc (ib, data_in, &len, &entry_len);
> > > +      lto_input_scc (ib, data_in, &len, &entry_len, false);
> > >  
> > >        /* Register DECLs with the debuginfo machinery.  */
> > >        while (!dref_queue.is_empty ())
> > > @@ -1551,7 +1553,15 @@ lto_input_tree (class lto_input_block *ib, class 
> > > data_in *data_in)
> > >     debug_hooks->register_external_die (e.decl, e.sym, e.off);
> > >   }
> > >      }
> > > -  return lto_input_tree_1 (ib, data_in, tag, 0);
> > > +  tree t = lto_input_tree_1 (ib, data_in, tag, 0);
> > > +
> > > +  if (!dref_queue.is_empty ())
> > > +    {
> > > +      dref_entry e = dref_queue.pop ();
> > > +      debug_hooks->register_external_die (e.decl, e.sym, e.off);
> > > +      gcc_checking_assert (dref_queue.is_empty ());
> > > +    }
> > > +  return t;
> > >  }
> > >  
> > >  
> > > diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
> > > index 3d94324881f..1c5b8f42c9b 100644
> > > --- a/gcc/lto-streamer-out.c
> > > +++ b/gcc/lto-streamer-out.c
> > > @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
> > >  #include "tree-dfa.h"
> > >  #include "file-prefix-map.h" /* remap_debug_filename()  */
> > >  #include "output.h"
> > > +#include "ipa-utils.h"
> > >  
> > >  
> > >  static void lto_write_tree (struct output_block*, tree, bool);
> > > @@ -75,6 +76,10 @@ create_output_block (enum lto_section_type 
> > > section_type)
> > >  
> > >    ob->section_type = section_type;
> > >    ob->decl_state = lto_get_out_decl_state ();
> > > +  /* Only global decl stream in non-wpa will ever be considered by tree
> > > +     merging.  */
> > > +  if (!flag_wpa && section_type == LTO_section_decls)
> > > +    ob->local_trees = new (hash_set <tree>);
> > >    ob->main_stream = XCNEW (struct lto_output_stream);
> > >    ob->string_stream = XCNEW (struct lto_output_stream);
> > >    ob->writer_cache = streamer_tree_cache_create (!flag_wpa, true, false);
> > > @@ -100,6 +105,7 @@ destroy_output_block (struct output_block *ob)
> > >  
> > >    delete ob->string_hash_table;
> > >    ob->string_hash_table = NULL;
> > > +  delete ob->local_trees;
> > >  
> > >    free (ob->main_stream);
> > >    free (ob->string_stream);
> > > @@ -532,6 +538,8 @@ private:
> > >      bool ref_p;
> > >      bool this_ref_p;
> > >    };
> > > +  /* Maximum index of scc stack containing a local tree.  */
> > > +  int max_local_entry;
> > >  
> > >    static int scc_entry_compare (const void *, const void *);
> > >  
> > > @@ -550,6 +558,33 @@ private:
> > >    struct obstack sccstate_obstack;
> > >  };
> > >  
> > > +/* Return true if type can be merged.
> > > +   TRANSLATION_UNIT_DECL is handled specially since references to it does
> > > +   not make other trees local as well.  */
> > > +
> > > +static bool
> > > +local_tree_p (tree t)
> > > +{
> > > +  switch (TREE_CODE (t))
> > > +    {
> > > +    case LABEL_DECL:
> > > +      return true;
> > > +    case NAMESPACE_DECL:
> > > +      return !DECL_NAME (t);
> > > +    case VAR_DECL:
> > > +    case FUNCTION_DECL:
> > > +      return !TREE_PUBLIC (t) && !DECL_EXTERNAL (t);
> > > +    case RECORD_TYPE:
> > > +    case UNION_TYPE:
> > > +    case ENUMERAL_TYPE:
> > > +      return TYPE_MAIN_VARIANT (t) == t
> > > +      && odr_type_p (t) && type_with_linkage_p (t)
> > > +      && type_in_anonymous_namespace_p (t);
> > > +    default:
> > > +      return false;
> > > +    }
> > > +}
> > > +
> > >  /* Emit the physical representation of tree node EXPR to output block OB,
> > >     using depth-first search on the subgraph.  If THIS_REF_P is true, the
> > >     leaves of EXPR are emitted as references via lto_output_tree_ref.
> > > @@ -560,6 +595,8 @@ DFS::DFS (struct output_block *ob, tree expr, bool 
> > > ref_p, bool this_ref_p,
> > >     bool single_p)
> > >  {
> > >    unsigned int next_dfs_num = 1;
> > > +
> > > +  max_local_entry = -1;
> > >    gcc_obstack_init (&sccstate_obstack);
> > >    DFS_write_tree (ob, NULL, expr, ref_p, this_ref_p);
> > >    while (!worklist_vec.is_empty ())
> > > @@ -586,6 +623,8 @@ DFS::DFS (struct output_block *ob, tree expr, bool 
> > > ref_p, bool this_ref_p,
> > >     scc_entry e = { expr, 0 };
> > >     /* Not yet visited.  DFS recurse and push it onto the stack.  */
> > >     *slot = cstate = XOBNEW (&sccstate_obstack, struct sccs);
> > > +   if (ob->local_trees && local_tree_p (expr))
> > > +     max_local_entry = sccstack.length ();
> > >     sccstack.safe_push (e);
> > >     cstate->dfsnum = next_dfs_num++;
> > >     cstate->low = cstate->dfsnum;
> > > @@ -640,7 +679,26 @@ DFS::DFS (struct output_block *ob, tree expr, bool 
> > > ref_p, bool this_ref_p,
> > >        any merging there.  */
> > >     hashval_t scc_hash = 0;
> > >     unsigned scc_entry_len = 0;
> > > -   if (!flag_wpa)
> > > +   bool local_to_unit = !ob->local_trees
> > > +                        || max_local_entry >= (int)first;
> > > +
> > > +   /* Remember that trees are local so info gets propagated to other
> > > +      SCCs.  */
> > > +   if (local_to_unit && ob->local_trees)
> > > +     {
> > > +       for (unsigned i = 0; i < size; ++i)
> > > +         ob->local_trees->add (sccstack[first + i].t);
> > > +     }
> > > +
> > > +   /* As a special case do not stream TRANSLATION_UNIT_DECL as shared
> > > +      tree.  We can not mark it local because references to it does not
> > > +      make other trees local (all global decls reffer to it via
> > > +      CONTEXT).  */
> > > +   if (size == 1
> > > +       && TREE_CODE (sccstack[first].t) == TRANSLATION_UNIT_DECL)
> > > +     local_to_unit = true;
> > > +
> > > +   if (!local_to_unit)
> > >       {
> > >         scc_hash = hash_scc (ob, first, size, ref_p, this_ref_p);
> > >  
> > > @@ -672,10 +730,44 @@ DFS::DFS (struct output_block *ob, tree expr, bool 
> > > ref_p, bool this_ref_p,
> > >         gcc_checking_assert (scc_entry_len == 1);
> > >       }
> > >  
> > > -   /* Write LTO_tree_scc.  */
> > > -   streamer_write_record_start (ob, LTO_tree_scc);
> > > -   streamer_write_uhwi (ob, size);
> > > -   streamer_write_uhwi (ob, scc_hash);
> > > +   worklist_vec.pop ();
> > > +
> > > +   /* If we are not streaming global section tree sharing will not
> > > +      be done, but we must mark block of trees so streamer can
> > > +      distinguish it from reference being streamed.  */
> > > +   if (ob->section_type != LTO_section_decls)
> > > +     {
> > > +       /* If this is the original tree we stream and it forms SCC
> > > +          by itself then we do not need to stream SCC at all.  */
> > > +       if (worklist_vec.is_empty () && first == 0 && size == 1)
> > > +          return;
> > > +       streamer_write_record_start (ob, LTO_trees);
> > > +       streamer_write_uhwi (ob, size);
> > > +     }
> > > +   /* Write LTO_tree_scc if sharing is going to be performned.  */
> > > +   else if (!local_to_unit
> > > +            /* These are special since sharing is not done by tree
> > > +               merging machinery.  We can not special case them earlier
> > > +               because we still need to compute hash for further sharing
> > > +               of trees referring to them.  */
> > > +            && (size != 1
> > > +                || (TREE_CODE (sccstack[first].t) != IDENTIFIER_NODE
> > > +                    && (TREE_CODE (sccstack[first].t) != INTEGER_CST
> > > +                        || TREE_OVERFLOW (sccstack[first].t)))))
> > > +
> > > +     {
> > > +       gcc_checking_assert (ob->section_type == LTO_section_decls);
> > > +       streamer_write_record_start (ob, LTO_tree_scc);
> > > +       streamer_write_uhwi (ob, size);
> > > +       streamer_write_uhwi (ob, scc_hash);
> > > +     }
> > > +   /* Non-trivial SCCs must be packed to trees blocks so forward
> > > +      references work correctly.  */
> > > +   else if (size != 1)
> > > +     {
> > > +        streamer_write_record_start (ob, LTO_trees);
> > > +        streamer_write_uhwi (ob, size);
> > > +     }
> > >  
> > >     /* Write size-1 SCCs without wrapping them inside SCC bundles.
> > >        All INTEGER_CSTs need to be handled this way as we need
> > > @@ -722,10 +814,11 @@ DFS::DFS (struct output_block *ob, tree expr, bool 
> > > ref_p, bool this_ref_p,
> > >  
> > >     /* Finally truncate the vector.  */
> > >     sccstack.truncate (first);
> > > +   if ((int)first <= max_local_entry)
> > > +     max_local_entry = first - 1;
> > >  
> > >     if (from_state)
> > >       from_state->low = MIN (from_state->low, cstate->low);
> > > -   worklist_vec.pop ();
> > >     continue;
> > >   }
> > >  
> > > @@ -1569,7 +1662,14 @@ DFS::DFS_write_tree (struct output_block *ob, sccs 
> > > *from_state,
> > >  
> > >    /* Check if we already streamed EXPR.  */
> > >    if (streamer_tree_cache_lookup (ob->writer_cache, expr, NULL))
> > > -    return;
> > > +    {
> > > +      /* Refernece to a local tree makes entry also local.  We always 
> > > process
> > > +  top of stack entry, so set max to number of entries in stack - 1.  */
> > > +      if (ob->local_trees
> > > +   && ob->local_trees->contains (expr))
> > > + max_local_entry = sccstack.length () - 1;
> > > +      return;
> > > +    }
> > >  
> > >    worklist w;
> > >    w.expr = expr;
> > > @@ -1641,15 +1741,20 @@ lto_output_tree (struct output_block *ob, tree 
> > > expr,
> > >        DFS (ob, expr, ref_p, this_ref_p, false);
> > >        in_dfs_walk = false;
> > >  
> > > -      /* Finally append a reference to the tree we were writing.
> > > -  ???  If expr ended up as a singleton we could have
> > > -  inlined it here and avoid outputting a reference.  */
> > > +      /* Finally append a reference to the tree we were writing.  */
> > >        existed_p = streamer_tree_cache_lookup (ob->writer_cache, expr, 
> > > &ix);
> > > -      gcc_assert (existed_p);
> > > -      streamer_write_record_start (ob, LTO_tree_pickle_reference);
> > > -      streamer_write_uhwi (ob, ix);
> > > -      streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
> > > -                    lto_tree_code_to_tag (TREE_CODE (expr)));
> > > +
> > > +      /* DFS walk above possibly skipped streaming EXPR itself to let us 
> > > inline
> > > +  it.  */
> > > +      if (!existed_p)
> > > + lto_output_tree_1 (ob, expr, 0, ref_p, this_ref_p);
> > > +      else
> > > + {
> > > +   streamer_write_record_start (ob, LTO_tree_pickle_reference);
> > > +   streamer_write_uhwi (ob, ix);
> > > +   streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS,
> > > +                        lto_tree_code_to_tag (TREE_CODE (expr)));
> > > + }
> > >        if (streamer_dump_file)
> > >   {
> > >     print_node_brief (streamer_dump_file, "   Finished SCC of ",
> > > diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> > > index 76aa6fe34b8..a466fb8b329 100644
> > > --- a/gcc/lto-streamer.h
> > > +++ b/gcc/lto-streamer.h
> > > @@ -178,6 +178,9 @@ enum LTO_tags
> > >    /* Special for global streamer.  A blob of unnamed tree nodes.  */
> > >    LTO_tree_scc,
> > >  
> > > +  /* Sequence of trees.  */
> > > +  LTO_trees,
> > > +
> > >    /* References to indexable tree nodes.  These objects are stored in
> > >       tables that are written separately from the function bodies that
> > >       reference them.  This way they can be instantiated even when the
> > > @@ -751,6 +754,9 @@ struct output_block
> > >    /* Cache of nodes written in this section.  */
> > >    struct streamer_tree_cache_d *writer_cache;
> > >  
> > > +  /* All trees identified as local to the unit streamed.  */
> > > +  hash_set<tree> *local_trees;
> > > +
> > >    /* All data persistent across whole duration of output block
> > >       can go here.  */
> > >    struct obstack obstack;
> > > @@ -901,7 +907,7 @@ tree lto_input_tree_ref (class lto_input_block *, 
> > > class data_in *,
> > >  void lto_tag_check_set (enum LTO_tags, int, ...);
> > >  void lto_init_eh (void);
> > >  hashval_t lto_input_scc (class lto_input_block *, class data_in *,
> > > -                  unsigned *, unsigned *);
> > > +                  unsigned *, unsigned *, bool);
> > >  tree lto_input_tree_1 (class lto_input_block *, class data_in *,
> > >                  enum LTO_tags, hashval_t hash);
> > >  tree lto_input_tree (class lto_input_block *, class data_in *);
> > > diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
> > > index 682a82d61ba..d04b1c9ca7b 100644
> > > --- a/gcc/lto/lto-common.c
> > > +++ b/gcc/lto/lto-common.c
> > > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
> > >  #include "builtins.h"
> > >  #include "lto-common.h"
> > >  #include "tree-pretty-print.h"
> > > +#include "print-tree.h"
> > >  
> > >  /* True when no new types are going to be streamd from the global 
> > > stream.  */
> > >  
> > > @@ -1054,6 +1055,7 @@ static unsigned long num_prevailing_types;
> > >  static unsigned long num_type_scc_trees;
> > >  static unsigned long total_scc_size;
> > >  static unsigned long num_sccs_read;
> > > +static unsigned long num_unshared_trees_read;
> > >  static unsigned long total_scc_size_merged;
> > >  static unsigned long num_sccs_merged;
> > >  static unsigned long num_scc_compares;
> > > @@ -1088,6 +1090,10 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
> > >    compare_values (TREE_CODE);
> > >    code = TREE_CODE (t1);
> > >  
> > > +  /* If we end up comparing translation unit decls we either forgot to 
> > > mark
> > > +     some SCC as local or we compare too much.  */
> > > +  gcc_checking_assert (code != TRANSLATION_UNIT_DECL);
> > > +
> > >    if (!TYPE_P (t1))
> > >      {
> > >        compare_values (TREE_SIDE_EFFECTS);
> > > @@ -1626,6 +1632,28 @@ cmp_tree (const void *p1_, const void *p2_)
> > >    return ((uintptr_t)p1[1] < (uintptr_t)p2[1]) ? -1 : 1;
> > >  }
> > >  
> > > +/* New scc of size 1 containing T was streamed in from DATA_IN and not 
> > > merged.
> > > +   Register it to reader cache at index FROM.  */
> > > +
> > > +static void
> > > +process_dref (class data_in *data_in, tree t, unsigned from)
> > > +{
> > > +  struct streamer_tree_cache_d *cache = data_in->reader_cache;
> > > +  /* If we got a debug reference queued, see if the prevailing
> > > +     tree has a debug reference and if not, register the one
> > > +     for the tree we are about to throw away.  */
> > > +  if (dref_queue.length () == 1)
> > > +    {
> > > +      dref_entry e = dref_queue.pop ();
> > > +      gcc_assert (e.decl
> > > +           == streamer_tree_cache_get_tree (cache, from));
> > > +      const char *sym;
> > > +      unsigned HOST_WIDE_INT off;
> > > +      if (!debug_hooks->die_ref_for_decl (t, &sym, &off))
> > > + debug_hooks->register_external_die (t, e.sym, e.off);
> > > +    }
> > > +}
> > > +
> > >  /* Try to unify the SCC with nodes FROM to FROM + LEN in CACHE and
> > >     hash value SCC_HASH with an already recorded SCC.  Return true if
> > >     that was successful, otherwise return false.  */
> > > @@ -1646,22 +1674,16 @@ unify_scc (class data_in *data_in, unsigned from,
> > >      {
> > >        tree t = streamer_tree_cache_get_tree (cache, from + i);
> > >        scc->entries[i] = t;
> > > -      /* Do not merge SCCs with local entities inside them.  Also do
> > > -  not merge TRANSLATION_UNIT_DECLs and anonymous namespaces
> > > -  and types therein types.  */
> > > -      if (TREE_CODE (t) == TRANSLATION_UNIT_DECL
> > > -   || (VAR_OR_FUNCTION_DECL_P (t)
> > > -       && !(TREE_PUBLIC (t) || DECL_EXTERNAL (t)))
> > > -   || TREE_CODE (t) == LABEL_DECL
> > > -   || (TREE_CODE (t) == NAMESPACE_DECL && !DECL_NAME (t))
> > > -   || (TYPE_P (t)
> > > -       && type_with_linkage_p (TYPE_MAIN_VARIANT (t))
> > > -       && type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t))))
> > > - {
> > > -   /* Avoid doing any work for these cases and do not worry to
> > > -      record the SCCs for further merging.  */
> > > -   return false;
> > > - }
> > > +      /* These types should be streamed as unshared.  */
> > > +      gcc_checking_assert
> > > +  (!(TREE_CODE (t) == TRANSLATION_UNIT_DECL
> > > +     || (VAR_OR_FUNCTION_DECL_P (t)
> > > +         && !(TREE_PUBLIC (t) || DECL_EXTERNAL (t)))
> > > +     || TREE_CODE (t) == LABEL_DECL
> > > +     || (TREE_CODE (t) == NAMESPACE_DECL && !DECL_NAME (t))
> > > +     || (TYPE_P (t)
> > > +         && type_with_linkage_p (TYPE_MAIN_VARIANT (t))
> > > +         && type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t)))));
> > >      }
> > >  
> > >    /* Look for the list of candidate SCCs to compare against.  */
> > > @@ -1712,21 +1734,7 @@ unify_scc (class data_in *data_in, unsigned from,
> > >        to the tree node mapping computed by compare_tree_sccs.  */
> > >     if (len == 1)
> > >       {
> > > -       /* If we got a debug reference queued, see if the prevailing
> > > -          tree has a debug reference and if not, register the one
> > > -          for the tree we are about to throw away.  */
> > > -       if (dref_queue.length () == 1)
> > > -         {
> > > -           dref_entry e = dref_queue.pop ();
> > > -           gcc_assert (e.decl
> > > -                       == streamer_tree_cache_get_tree (cache, from));
> > > -           const char *sym;
> > > -           unsigned HOST_WIDE_INT off;
> > > -           if (!debug_hooks->die_ref_for_decl (pscc->entries[0], &sym,
> > > -                                               &off))
> > > -             debug_hooks->register_external_die (pscc->entries[0],
> > > -                                                 e.sym, e.off);
> > > -         }
> > > +       process_dref (data_in, pscc->entries[0], from);
> > >         lto_maybe_register_decl (data_in, pscc->entries[0], from);
> > >         streamer_tree_cache_replace_tree (cache, pscc->entries[0], from);
> > >       }
> > > @@ -1785,7 +1793,65 @@ unify_scc (class data_in *data_in, unsigned from,
> > >    return unified_p;
> > >  }
> > >  
> > > +typedef int_hash<unsigned, 0, UINT_MAX> code_id_hash;
> > > +
> > > +/* Do registering necessary once new tree fully streamed in (including 
> > > all
> > > +   trees it reffers to).  */
> > > +
> > > +static void
> > > +process_new_tree (tree t, hash_map <code_id_hash, unsigned> *hm,
> > > +           unsigned index, unsigned *total, class data_in *data_in)
> > > +{
> > > +  /* Reconstruct the type variant and pointer-to/reference-to
> > > +     chains.  */
> > > +  if (TYPE_P (t))
> > > +    {
> > > +      /* Map the tree types to their frequencies.  */
> > > +      if (flag_lto_dump_type_stats)
> > > + {
> > > +   unsigned key = (unsigned) TREE_CODE (t);
> > > +   unsigned *countp = hm->get (key);
> > > +   hm->put (key, countp ? (*countp) + 1 : 1);
> > > +   (*total)++;
> > > + }
> > > +
> > > +      num_prevailing_types++;
> > > +      lto_fixup_prevailing_type (t);
> > >  
> > > +      /* Compute the canonical type of all non-ODR types.
> > > +  Delay ODR types for the end of merging process - the canonical
> > > +  type for those can be computed using the (unique) name however
> > > +  we want to do this only if units in other languages do not
> > > +  contain structurally equivalent type.
> > > +
> > > +  Because SCC components are streamed in random (hash) order
> > > +  we may have encountered the type before while registering
> > > +  type canonical of a derived type in the same SCC.  */
> > > +      if (!TYPE_CANONICAL (t))
> > > + {
> > > +   if (!RECORD_OR_UNION_TYPE_P (t)
> > > +       || !TYPE_CXX_ODR_P (t))
> > > +     gimple_register_canonical_type (t);
> > > +   else if (COMPLETE_TYPE_P (t))
> > > +     vec_safe_push (types_to_register, t);
> > > + }
> > > +      if (TYPE_MAIN_VARIANT (t) == t && odr_type_p (t))
> > > + register_odr_type (t);
> > > +    }
> > > +  /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its
> > > +     type which is also member of this SCC.  */
> > > +  if (TREE_CODE (t) == INTEGER_CST
> > > +      && !TREE_OVERFLOW (t))
> > > +    cache_integer_cst (t);
> > > +  if (!flag_ltrans)
> > > +    {
> > > +      lto_maybe_register_decl (data_in, t, index);
> > > +      /* Scan the tree for references to global functions or
> > > +  variables and record those for later fixup.  */
> > > +      if (mentions_vars_p (t))
> > > + vec_safe_push (tree_with_vars, t);
> > > +    }
> > > +}
> > >  
> > >  /* Read all the symbols from buffer DATA, using descriptors in DECL_DATA.
> > >     RESOLUTIONS is the set of symbols picked by the linker (read from the
> > > @@ -1813,7 +1879,6 @@ lto_read_decls (struct lto_file_decl_data 
> > > *decl_data, const void *data,
> > >    /* We do not uniquify the pre-loaded cache entries, those are 
> > > middle-end
> > >       internal types that should not be merged.  */
> > >  
> > > -  typedef int_hash<unsigned, 0, UINT_MAX> code_id_hash;
> > >    hash_map <code_id_hash, unsigned> hm;
> > >    unsigned total = 0;
> > >  
> > > @@ -1824,31 +1889,41 @@ lto_read_decls (struct lto_file_decl_data 
> > > *decl_data, const void *data,
> > >        unsigned from = data_in->reader_cache->nodes.length ();
> > >        /* Read and uniquify SCCs as in the input stream.  */
> > >        enum LTO_tags tag = streamer_read_record_start (&ib_main);
> > > -      if (tag == LTO_tree_scc)
> > > +      if (tag == LTO_tree_scc || tag == LTO_trees)
> > >   {
> > >     unsigned len_;
> > >     unsigned scc_entry_len;
> > > +
> > > +   /* Because we stream in SCC order we know that all unshared trees
> > > +      are now fully streamed.  Process them.  */
> > >     hashval_t scc_hash = lto_input_scc (&ib_main, data_in, &len_,
> > > -                                       &scc_entry_len);
> > > +                                       &scc_entry_len,
> > > +                                       tag == LTO_tree_scc);
> > >     unsigned len = data_in->reader_cache->nodes.length () - from;
> > >     gcc_assert (len == len_);
> > >  
> > > -   total_scc_size += len;
> > > -   num_sccs_read++;
> > > +   if (tag == LTO_tree_scc)
> > > +     {
> > > +       total_scc_size += len;
> > > +       num_sccs_read++;
> > > +     }
> > > +   else
> > > +     num_unshared_trees_read += len;
> > >  
> > >     /* We have the special case of size-1 SCCs that are pre-merged
> > >        by means of identifier and string sharing for example.
> > >        ???  Maybe we should avoid streaming those as SCCs.  */
> > >     tree first = streamer_tree_cache_get_tree (data_in->reader_cache,
> > >                                                from);
> > > -   if (len == 1
> > > -       && (TREE_CODE (first) == IDENTIFIER_NODE
> > > -           || (TREE_CODE (first) == INTEGER_CST
> > > -               && !TREE_OVERFLOW (first))))
> > > -     continue;
> > > +   /* Identifier and integers are shared specially, they should never
> > > +      go by the tree merging path.  */
> > > +   gcc_checking_assert ((TREE_CODE (first) != IDENTIFIER_NODE
> > > +                         && (TREE_CODE (first) != INTEGER_CST
> > > +                             || TREE_OVERFLOW (first)))
> > > +                        || len != 1);
> > >  
> > >     /* Try to unify the SCC with already existing ones.  */
> > > -   if (!flag_ltrans
> > > +   if (!flag_ltrans && tag != LTO_trees
> > >         && unify_scc (data_in, from,
> > >                       len, scc_entry_len, scc_hash))
> > >       continue;
> > > @@ -1862,56 +1937,9 @@ lto_read_decls (struct lto_file_decl_data 
> > > *decl_data, const void *data,
> > >       {
> > >         tree t = streamer_tree_cache_get_tree (data_in->reader_cache,
> > >                                                from + i);
> > > -       /* Reconstruct the type variant and pointer-to/reference-to
> > > -          chains.  */
> > > +       process_new_tree (t, &hm, from + i, &total, data_in);
> > >         if (TYPE_P (t))
> > > -         {
> > > -           /* Map the tree types to their frequencies.  */
> > > -           if (flag_lto_dump_type_stats)
> > > -             {
> > > -               unsigned key = (unsigned) TREE_CODE (t);
> > > -               unsigned *countp = hm.get (key);
> > > -               hm.put (key, countp ? (*countp) + 1 : 1);
> > > -               total++;
> > > -             }
> > > -
> > > -           seen_type = true;
> > > -           num_prevailing_types++;
> > > -           lto_fixup_prevailing_type (t);
> > > -
> > > -           /* Compute the canonical type of all non-ODR types.
> > > -              Delay ODR types for the end of merging process - the 
> > > canonical
> > > -              type for those can be computed using the (unique) name 
> > > however
> > > -              we want to do this only if units in other languages do not
> > > -              contain structurally equivalent type.
> > > -
> > > -              Because SCC components are streamed in random (hash) order
> > > -              we may have encountered the type before while registering
> > > -              type canonical of a derived type in the same SCC.  */
> > > -           if (!TYPE_CANONICAL (t))
> > > -             {
> > > -               if (!RECORD_OR_UNION_TYPE_P (t)
> > > -                   || !TYPE_CXX_ODR_P (t))
> > > -                 gimple_register_canonical_type (t);
> > > -               else if (COMPLETE_TYPE_P (t))
> > > -                 vec_safe_push (types_to_register, t);
> > > -             }
> > > -           if (TYPE_MAIN_VARIANT (t) == t && odr_type_p (t))
> > > -             register_odr_type (t);
> > > -         }
> > > -       /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its
> > > -          type which is also member of this SCC.  */
> > > -       if (TREE_CODE (t) == INTEGER_CST
> > > -           && !TREE_OVERFLOW (t))
> > > -         cache_integer_cst (t);
> > > -       if (!flag_ltrans)
> > > -         {
> > > -           lto_maybe_register_decl (data_in, t, from + i);
> > > -           /* Scan the tree for references to global functions or
> > > -              variables and record those for later fixup.  */
> > > -           if (mentions_vars_p (t))
> > > -             vec_safe_push (tree_with_vars, t);
> > > -         }
> > > +         seen_type = true;
> > >       }
> > >  
> > >     /* Register DECLs with the debuginfo machinery.  */
> > > @@ -1926,9 +1954,26 @@ lto_read_decls (struct lto_file_decl_data 
> > > *decl_data, const void *data,
> > >   }
> > >        else
> > >   {
> > > -   /* Pickle stray references.  */
> > >     t = lto_input_tree_1 (&ib_main, data_in, tag, 0);
> > > -   gcc_assert (t && data_in->reader_cache->nodes.length () == from);
> > > +   /* We streamed in new tree.  Add it to cache and process dref.  */
> > > +   if (data_in->reader_cache->nodes.length () == from + 1)
> > > +     {
> > > +       num_unshared_trees_read++;
> > > +       data_in->location_cache.accept_location_cache ();
> > > +       process_dref (data_in, t, from);
> > > +       if (TREE_CODE (t) == IDENTIFIER_NODE
> > > +           || (TREE_CODE (t) == INTEGER_CST
> > > +               && !TREE_OVERFLOW (t)))
> > > +         ;
> > > +       else
> > > +         {
> > > +           lto_maybe_register_decl (data_in, t, from);
> > > +           process_new_tree (t, &hm, from, &total, data_in);
> > > +         }
> > > +     }
> > > +   else
> > > +     /* FIXME: It seems useless to pickle stray references.  */
> > > +     gcc_assert (data_in->reader_cache->nodes.length () == from);
> > >   }
> > >      }
> > >  
> > > @@ -2953,10 +2998,13 @@ print_lto_report_1 (void)
> > >    const char *pfx = (flag_lto) ? "LTO" : (flag_wpa) ? "WPA" : "LTRANS";
> > >    fprintf (stderr, "%s statistics\n", pfx);
> > >  
> > > -  fprintf (stderr, "[%s] read %lu SCCs of average size %f\n",
> > > +  fprintf (stderr, "[%s] read %lu unshared trees\n",
> > > +    pfx, num_unshared_trees_read);
> > > +  fprintf (stderr, "[%s] read %lu mergeable SCCs of average size %f\n",
> > >      pfx, num_sccs_read, total_scc_size / (double)num_sccs_read);
> > > -  fprintf (stderr, "[%s] %lu tree bodies read in total\n", pfx, 
> > > total_scc_size);
> > > -  if (flag_wpa && tree_scc_hash)
> > > +  fprintf (stderr, "[%s] %lu tree bodies read in total\n", pfx,
> > > +    total_scc_size + num_unshared_trees_read);
> > > +  if (flag_wpa && tree_scc_hash && num_sccs_read)
> > >      {
> > >        fprintf (stderr, "[%s] tree SCC table: size %ld, %ld elements, "
> > >          "collision ratio: %f\n", pfx,
> > > 
> > 
> > -- 
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Reply via email to