On Wed, 20 May 2020, Jan Hubicka wrote: > Hi, > > > > The patch looks reasonable - the wins are not entirely clear to me > > since you mix in LTO streaming format optimizations. You promised > > to improve collisions on the SCC merging side but those are not > > spectacular (if present at all) in the above data? > > Thanks for looking into that. Collision ratios was imporving on Firefox > that has more anonymous types and their derivations, but I was currently > seeing an ICE building it. I will debug that. > > I think, quite surprisingly, most stream size benefits come from not > streaming hash codes for integer_csts and indentifier_nodes. > These are common and hash codes are large + hard to compress. > > I was thinking about trying making them artifically smaller (32bit only) > once we nail down the unecessary scc chains. I will look into that now > - last time i did stats it was mostly anonymous types and that is why I > started to worry about them first. > > > > The patch itself looks really nice. The only comment I have is > > with respect to the new local_tree_p predicate and its relation > > to tree_is_indexable. We should be able to assert that > > local trees are not indexable and indexable trees never are local. > > This is not true. Anonymous namespace types are indexable (since they > are shared across function bodies) but they are local to translation > unit. Similarly non-public decls. > > > Does this make the predicates equal? ;) It would be nice if > > they would cross-reference each other (maybe even assert at least > > in one way), maybe put next to each other and share parts? > > We can do this as followup. > > If you are OK doing more stuff as folowup, I would be happy. I think it > is good base to fix other issues (track down longer chain in merging; > sanify scc streaming more etc). > > I just tried to avoid snowballing, since the patch already mixes few > things together.
Yeah, let's do everything else as followup - the patch as-is looks good to me. Thanks, Richard. > Thanks, > Honza > > > > Thanks, > > Richard. > > > > > gcc/ChangeLog: > > > > > > 2020-05-19 Jan Hubicka <hubi...@ucw.cz> > > > > > > * lto-streamer-in.c (lto_input_scc): Add SHARED_SCC parameter. > > > (lto_input_tree_1): Strenghten sanity check. > > > (lto_input_tree): Update call of lto_input_scc. > > > * lto-streamer-out.c: Include ipa-utils.h > > > (create_output_block): Initialize local_trees if merigng is going > > > to happen. > > > (destroy_output_block): Destroy local_trees. > > > (DFS): Add max_local_entry. > > > (local_tree_p): New function. > > > (DFS::DFS): Initialize and maintain it. > > > (DFS::DFS_write_tree): Decide on streaming format. > > > (lto_output_tree): Stream inline singleton SCCs > > > * lto-streamer.h (enum LTO_tags): Add LTO_trees. > > > (struct output_block): Add local_trees. > > > (lto_input_scc): Update prototype. > > > > > > gcc/lto/ChangeLog: > > > > > > 2020-05-19 Jan Hubicka <hubi...@ucw.cz> > > > > > > * lto-common.c (compare_tree_sccs_1): Sanity check that we never > > > read TRANSLATION_UNIT_DECL. > > > (process_dref): Break out from ... > > > (unify_scc): ... here. > > > (process_new_tree): Break out from ... > > > (lto_read_decls): ... here; handle streaming of singleton trees. > > > (print_lto_report_1): Update statistics. > > > > > > diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c > > > index 244f5b8aa5c..6d571c4344a 100644 > > > --- a/gcc/lto-streamer-in.c > > > +++ b/gcc/lto-streamer-in.c > > > @@ -1424,16 +1424,17 @@ lto_read_tree (class lto_input_block *ib, class > > > data_in *data_in, > > > > > > > > > /* Populate the reader cache with trees materialized from the SCC > > > - following in the IB, DATA_IN stream. */ > > > + following in the IB, DATA_IN stream. > > > + If SHARED_SCC is true we input LTO_tree_scc. */ > > > > > > hashval_t > > > lto_input_scc (class lto_input_block *ib, class data_in *data_in, > > > - unsigned *len, unsigned *entry_len) > > > + unsigned *len, unsigned *entry_len, bool shared_scc) > > > { > > > /* A blob of unnamed tree nodes, fill the cache from it and > > > recurse. */ > > > unsigned size = streamer_read_uhwi (ib); > > > - hashval_t scc_hash = streamer_read_uhwi (ib); > > > + hashval_t scc_hash = shared_scc ? streamer_read_uhwi (ib) : 0; > > > unsigned scc_entry_len = 1; > > > > > > if (size == 1) > > > @@ -1456,7 +1457,8 @@ lto_input_scc (class lto_input_block *ib, class > > > data_in *data_in, > > > || (tag >= LTO_field_decl_ref && tag <= LTO_global_decl_ref) > > > || tag == LTO_tree_pickle_reference > > > || tag == LTO_integer_cst > > > - || tag == LTO_tree_scc) > > > + || tag == LTO_tree_scc > > > + || tag == LTO_trees) > > > gcc_unreachable (); > > > > > > result = streamer_alloc_tree (ib, data_in, tag); > > > @@ -1522,7 +1524,7 @@ lto_input_tree_1 (class lto_input_block *ib, class > > > data_in *data_in, > > > (a, len, TYPE_PRECISION (type))); > > > streamer_tree_cache_append (data_in->reader_cache, result, hash); > > > } > > > - else if (tag == LTO_tree_scc) > > > + else if (tag == LTO_tree_scc || tag == LTO_trees) > > > gcc_unreachable (); > > > else > > > { > > > @@ -1538,11 +1540,11 @@ lto_input_tree (class lto_input_block *ib, class > > > data_in *data_in) > > > { > > > enum LTO_tags tag; > > > > > > - /* Input and skip SCCs. */ > > > - while ((tag = streamer_read_record_start (ib)) == LTO_tree_scc) > > > + /* Input pickled trees needed to stream in the refrence. */ > > > + while ((tag = streamer_read_record_start (ib)) == LTO_trees) > > > { > > > unsigned len, entry_len; > > > - lto_input_scc (ib, data_in, &len, &entry_len); > > > + lto_input_scc (ib, data_in, &len, &entry_len, false); > > > > > > /* Register DECLs with the debuginfo machinery. */ > > > while (!dref_queue.is_empty ()) > > > @@ -1551,7 +1553,15 @@ lto_input_tree (class lto_input_block *ib, class > > > data_in *data_in) > > > debug_hooks->register_external_die (e.decl, e.sym, e.off); > > > } > > > } > > > - return lto_input_tree_1 (ib, data_in, tag, 0); > > > + tree t = lto_input_tree_1 (ib, data_in, tag, 0); > > > + > > > + if (!dref_queue.is_empty ()) > > > + { > > > + dref_entry e = dref_queue.pop (); > > > + debug_hooks->register_external_die (e.decl, e.sym, e.off); > > > + gcc_checking_assert (dref_queue.is_empty ()); > > > + } > > > + return t; > > > } > > > > > > > > > diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c > > > index 3d94324881f..1c5b8f42c9b 100644 > > > --- a/gcc/lto-streamer-out.c > > > +++ b/gcc/lto-streamer-out.c > > > @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3. If not see > > > #include "tree-dfa.h" > > > #include "file-prefix-map.h" /* remap_debug_filename() */ > > > #include "output.h" > > > +#include "ipa-utils.h" > > > > > > > > > static void lto_write_tree (struct output_block*, tree, bool); > > > @@ -75,6 +76,10 @@ create_output_block (enum lto_section_type > > > section_type) > > > > > > ob->section_type = section_type; > > > ob->decl_state = lto_get_out_decl_state (); > > > + /* Only global decl stream in non-wpa will ever be considered by tree > > > + merging. */ > > > + if (!flag_wpa && section_type == LTO_section_decls) > > > + ob->local_trees = new (hash_set <tree>); > > > ob->main_stream = XCNEW (struct lto_output_stream); > > > ob->string_stream = XCNEW (struct lto_output_stream); > > > ob->writer_cache = streamer_tree_cache_create (!flag_wpa, true, false); > > > @@ -100,6 +105,7 @@ destroy_output_block (struct output_block *ob) > > > > > > delete ob->string_hash_table; > > > ob->string_hash_table = NULL; > > > + delete ob->local_trees; > > > > > > free (ob->main_stream); > > > free (ob->string_stream); > > > @@ -532,6 +538,8 @@ private: > > > bool ref_p; > > > bool this_ref_p; > > > }; > > > + /* Maximum index of scc stack containing a local tree. */ > > > + int max_local_entry; > > > > > > static int scc_entry_compare (const void *, const void *); > > > > > > @@ -550,6 +558,33 @@ private: > > > struct obstack sccstate_obstack; > > > }; > > > > > > +/* Return true if type can be merged. > > > + TRANSLATION_UNIT_DECL is handled specially since references to it does > > > + not make other trees local as well. */ > > > + > > > +static bool > > > +local_tree_p (tree t) > > > +{ > > > + switch (TREE_CODE (t)) > > > + { > > > + case LABEL_DECL: > > > + return true; > > > + case NAMESPACE_DECL: > > > + return !DECL_NAME (t); > > > + case VAR_DECL: > > > + case FUNCTION_DECL: > > > + return !TREE_PUBLIC (t) && !DECL_EXTERNAL (t); > > > + case RECORD_TYPE: > > > + case UNION_TYPE: > > > + case ENUMERAL_TYPE: > > > + return TYPE_MAIN_VARIANT (t) == t > > > + && odr_type_p (t) && type_with_linkage_p (t) > > > + && type_in_anonymous_namespace_p (t); > > > + default: > > > + return false; > > > + } > > > +} > > > + > > > /* Emit the physical representation of tree node EXPR to output block OB, > > > using depth-first search on the subgraph. If THIS_REF_P is true, the > > > leaves of EXPR are emitted as references via lto_output_tree_ref. > > > @@ -560,6 +595,8 @@ DFS::DFS (struct output_block *ob, tree expr, bool > > > ref_p, bool this_ref_p, > > > bool single_p) > > > { > > > unsigned int next_dfs_num = 1; > > > + > > > + max_local_entry = -1; > > > gcc_obstack_init (&sccstate_obstack); > > > DFS_write_tree (ob, NULL, expr, ref_p, this_ref_p); > > > while (!worklist_vec.is_empty ()) > > > @@ -586,6 +623,8 @@ DFS::DFS (struct output_block *ob, tree expr, bool > > > ref_p, bool this_ref_p, > > > scc_entry e = { expr, 0 }; > > > /* Not yet visited. DFS recurse and push it onto the stack. */ > > > *slot = cstate = XOBNEW (&sccstate_obstack, struct sccs); > > > + if (ob->local_trees && local_tree_p (expr)) > > > + max_local_entry = sccstack.length (); > > > sccstack.safe_push (e); > > > cstate->dfsnum = next_dfs_num++; > > > cstate->low = cstate->dfsnum; > > > @@ -640,7 +679,26 @@ DFS::DFS (struct output_block *ob, tree expr, bool > > > ref_p, bool this_ref_p, > > > any merging there. */ > > > hashval_t scc_hash = 0; > > > unsigned scc_entry_len = 0; > > > - if (!flag_wpa) > > > + bool local_to_unit = !ob->local_trees > > > + || max_local_entry >= (int)first; > > > + > > > + /* Remember that trees are local so info gets propagated to other > > > + SCCs. */ > > > + if (local_to_unit && ob->local_trees) > > > + { > > > + for (unsigned i = 0; i < size; ++i) > > > + ob->local_trees->add (sccstack[first + i].t); > > > + } > > > + > > > + /* As a special case do not stream TRANSLATION_UNIT_DECL as shared > > > + tree. We can not mark it local because references to it does not > > > + make other trees local (all global decls reffer to it via > > > + CONTEXT). */ > > > + if (size == 1 > > > + && TREE_CODE (sccstack[first].t) == TRANSLATION_UNIT_DECL) > > > + local_to_unit = true; > > > + > > > + if (!local_to_unit) > > > { > > > scc_hash = hash_scc (ob, first, size, ref_p, this_ref_p); > > > > > > @@ -672,10 +730,44 @@ DFS::DFS (struct output_block *ob, tree expr, bool > > > ref_p, bool this_ref_p, > > > gcc_checking_assert (scc_entry_len == 1); > > > } > > > > > > - /* Write LTO_tree_scc. */ > > > - streamer_write_record_start (ob, LTO_tree_scc); > > > - streamer_write_uhwi (ob, size); > > > - streamer_write_uhwi (ob, scc_hash); > > > + worklist_vec.pop (); > > > + > > > + /* If we are not streaming global section tree sharing will not > > > + be done, but we must mark block of trees so streamer can > > > + distinguish it from reference being streamed. */ > > > + if (ob->section_type != LTO_section_decls) > > > + { > > > + /* If this is the original tree we stream and it forms SCC > > > + by itself then we do not need to stream SCC at all. */ > > > + if (worklist_vec.is_empty () && first == 0 && size == 1) > > > + return; > > > + streamer_write_record_start (ob, LTO_trees); > > > + streamer_write_uhwi (ob, size); > > > + } > > > + /* Write LTO_tree_scc if sharing is going to be performned. */ > > > + else if (!local_to_unit > > > + /* These are special since sharing is not done by tree > > > + merging machinery. We can not special case them earlier > > > + because we still need to compute hash for further sharing > > > + of trees referring to them. */ > > > + && (size != 1 > > > + || (TREE_CODE (sccstack[first].t) != IDENTIFIER_NODE > > > + && (TREE_CODE (sccstack[first].t) != INTEGER_CST > > > + || TREE_OVERFLOW (sccstack[first].t))))) > > > + > > > + { > > > + gcc_checking_assert (ob->section_type == LTO_section_decls); > > > + streamer_write_record_start (ob, LTO_tree_scc); > > > + streamer_write_uhwi (ob, size); > > > + streamer_write_uhwi (ob, scc_hash); > > > + } > > > + /* Non-trivial SCCs must be packed to trees blocks so forward > > > + references work correctly. */ > > > + else if (size != 1) > > > + { > > > + streamer_write_record_start (ob, LTO_trees); > > > + streamer_write_uhwi (ob, size); > > > + } > > > > > > /* Write size-1 SCCs without wrapping them inside SCC bundles. > > > All INTEGER_CSTs need to be handled this way as we need > > > @@ -722,10 +814,11 @@ DFS::DFS (struct output_block *ob, tree expr, bool > > > ref_p, bool this_ref_p, > > > > > > /* Finally truncate the vector. */ > > > sccstack.truncate (first); > > > + if ((int)first <= max_local_entry) > > > + max_local_entry = first - 1; > > > > > > if (from_state) > > > from_state->low = MIN (from_state->low, cstate->low); > > > - worklist_vec.pop (); > > > continue; > > > } > > > > > > @@ -1569,7 +1662,14 @@ DFS::DFS_write_tree (struct output_block *ob, sccs > > > *from_state, > > > > > > /* Check if we already streamed EXPR. */ > > > if (streamer_tree_cache_lookup (ob->writer_cache, expr, NULL)) > > > - return; > > > + { > > > + /* Refernece to a local tree makes entry also local. We always > > > process > > > + top of stack entry, so set max to number of entries in stack - 1. */ > > > + if (ob->local_trees > > > + && ob->local_trees->contains (expr)) > > > + max_local_entry = sccstack.length () - 1; > > > + return; > > > + } > > > > > > worklist w; > > > w.expr = expr; > > > @@ -1641,15 +1741,20 @@ lto_output_tree (struct output_block *ob, tree > > > expr, > > > DFS (ob, expr, ref_p, this_ref_p, false); > > > in_dfs_walk = false; > > > > > > - /* Finally append a reference to the tree we were writing. > > > - ??? If expr ended up as a singleton we could have > > > - inlined it here and avoid outputting a reference. */ > > > + /* Finally append a reference to the tree we were writing. */ > > > existed_p = streamer_tree_cache_lookup (ob->writer_cache, expr, > > > &ix); > > > - gcc_assert (existed_p); > > > - streamer_write_record_start (ob, LTO_tree_pickle_reference); > > > - streamer_write_uhwi (ob, ix); > > > - streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS, > > > - lto_tree_code_to_tag (TREE_CODE (expr))); > > > + > > > + /* DFS walk above possibly skipped streaming EXPR itself to let us > > > inline > > > + it. */ > > > + if (!existed_p) > > > + lto_output_tree_1 (ob, expr, 0, ref_p, this_ref_p); > > > + else > > > + { > > > + streamer_write_record_start (ob, LTO_tree_pickle_reference); > > > + streamer_write_uhwi (ob, ix); > > > + streamer_write_enum (ob->main_stream, LTO_tags, LTO_NUM_TAGS, > > > + lto_tree_code_to_tag (TREE_CODE (expr))); > > > + } > > > if (streamer_dump_file) > > > { > > > print_node_brief (streamer_dump_file, " Finished SCC of ", > > > diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h > > > index 76aa6fe34b8..a466fb8b329 100644 > > > --- a/gcc/lto-streamer.h > > > +++ b/gcc/lto-streamer.h > > > @@ -178,6 +178,9 @@ enum LTO_tags > > > /* Special for global streamer. A blob of unnamed tree nodes. */ > > > LTO_tree_scc, > > > > > > + /* Sequence of trees. */ > > > + LTO_trees, > > > + > > > /* References to indexable tree nodes. These objects are stored in > > > tables that are written separately from the function bodies that > > > reference them. This way they can be instantiated even when the > > > @@ -751,6 +754,9 @@ struct output_block > > > /* Cache of nodes written in this section. */ > > > struct streamer_tree_cache_d *writer_cache; > > > > > > + /* All trees identified as local to the unit streamed. */ > > > + hash_set<tree> *local_trees; > > > + > > > /* All data persistent across whole duration of output block > > > can go here. */ > > > struct obstack obstack; > > > @@ -901,7 +907,7 @@ tree lto_input_tree_ref (class lto_input_block *, > > > class data_in *, > > > void lto_tag_check_set (enum LTO_tags, int, ...); > > > void lto_init_eh (void); > > > hashval_t lto_input_scc (class lto_input_block *, class data_in *, > > > - unsigned *, unsigned *); > > > + unsigned *, unsigned *, bool); > > > tree lto_input_tree_1 (class lto_input_block *, class data_in *, > > > enum LTO_tags, hashval_t hash); > > > tree lto_input_tree (class lto_input_block *, class data_in *); > > > diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c > > > index 682a82d61ba..d04b1c9ca7b 100644 > > > --- a/gcc/lto/lto-common.c > > > +++ b/gcc/lto/lto-common.c > > > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see > > > #include "builtins.h" > > > #include "lto-common.h" > > > #include "tree-pretty-print.h" > > > +#include "print-tree.h" > > > > > > /* True when no new types are going to be streamd from the global > > > stream. */ > > > > > > @@ -1054,6 +1055,7 @@ static unsigned long num_prevailing_types; > > > static unsigned long num_type_scc_trees; > > > static unsigned long total_scc_size; > > > static unsigned long num_sccs_read; > > > +static unsigned long num_unshared_trees_read; > > > static unsigned long total_scc_size_merged; > > > static unsigned long num_sccs_merged; > > > static unsigned long num_scc_compares; > > > @@ -1088,6 +1090,10 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map) > > > compare_values (TREE_CODE); > > > code = TREE_CODE (t1); > > > > > > + /* If we end up comparing translation unit decls we either forgot to > > > mark > > > + some SCC as local or we compare too much. */ > > > + gcc_checking_assert (code != TRANSLATION_UNIT_DECL); > > > + > > > if (!TYPE_P (t1)) > > > { > > > compare_values (TREE_SIDE_EFFECTS); > > > @@ -1626,6 +1632,28 @@ cmp_tree (const void *p1_, const void *p2_) > > > return ((uintptr_t)p1[1] < (uintptr_t)p2[1]) ? -1 : 1; > > > } > > > > > > +/* New scc of size 1 containing T was streamed in from DATA_IN and not > > > merged. > > > + Register it to reader cache at index FROM. */ > > > + > > > +static void > > > +process_dref (class data_in *data_in, tree t, unsigned from) > > > +{ > > > + struct streamer_tree_cache_d *cache = data_in->reader_cache; > > > + /* If we got a debug reference queued, see if the prevailing > > > + tree has a debug reference and if not, register the one > > > + for the tree we are about to throw away. */ > > > + if (dref_queue.length () == 1) > > > + { > > > + dref_entry e = dref_queue.pop (); > > > + gcc_assert (e.decl > > > + == streamer_tree_cache_get_tree (cache, from)); > > > + const char *sym; > > > + unsigned HOST_WIDE_INT off; > > > + if (!debug_hooks->die_ref_for_decl (t, &sym, &off)) > > > + debug_hooks->register_external_die (t, e.sym, e.off); > > > + } > > > +} > > > + > > > /* Try to unify the SCC with nodes FROM to FROM + LEN in CACHE and > > > hash value SCC_HASH with an already recorded SCC. Return true if > > > that was successful, otherwise return false. */ > > > @@ -1646,22 +1674,16 @@ unify_scc (class data_in *data_in, unsigned from, > > > { > > > tree t = streamer_tree_cache_get_tree (cache, from + i); > > > scc->entries[i] = t; > > > - /* Do not merge SCCs with local entities inside them. Also do > > > - not merge TRANSLATION_UNIT_DECLs and anonymous namespaces > > > - and types therein types. */ > > > - if (TREE_CODE (t) == TRANSLATION_UNIT_DECL > > > - || (VAR_OR_FUNCTION_DECL_P (t) > > > - && !(TREE_PUBLIC (t) || DECL_EXTERNAL (t))) > > > - || TREE_CODE (t) == LABEL_DECL > > > - || (TREE_CODE (t) == NAMESPACE_DECL && !DECL_NAME (t)) > > > - || (TYPE_P (t) > > > - && type_with_linkage_p (TYPE_MAIN_VARIANT (t)) > > > - && type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t)))) > > > - { > > > - /* Avoid doing any work for these cases and do not worry to > > > - record the SCCs for further merging. */ > > > - return false; > > > - } > > > + /* These types should be streamed as unshared. */ > > > + gcc_checking_assert > > > + (!(TREE_CODE (t) == TRANSLATION_UNIT_DECL > > > + || (VAR_OR_FUNCTION_DECL_P (t) > > > + && !(TREE_PUBLIC (t) || DECL_EXTERNAL (t))) > > > + || TREE_CODE (t) == LABEL_DECL > > > + || (TREE_CODE (t) == NAMESPACE_DECL && !DECL_NAME (t)) > > > + || (TYPE_P (t) > > > + && type_with_linkage_p (TYPE_MAIN_VARIANT (t)) > > > + && type_in_anonymous_namespace_p (TYPE_MAIN_VARIANT (t))))); > > > } > > > > > > /* Look for the list of candidate SCCs to compare against. */ > > > @@ -1712,21 +1734,7 @@ unify_scc (class data_in *data_in, unsigned from, > > > to the tree node mapping computed by compare_tree_sccs. */ > > > if (len == 1) > > > { > > > - /* If we got a debug reference queued, see if the prevailing > > > - tree has a debug reference and if not, register the one > > > - for the tree we are about to throw away. */ > > > - if (dref_queue.length () == 1) > > > - { > > > - dref_entry e = dref_queue.pop (); > > > - gcc_assert (e.decl > > > - == streamer_tree_cache_get_tree (cache, from)); > > > - const char *sym; > > > - unsigned HOST_WIDE_INT off; > > > - if (!debug_hooks->die_ref_for_decl (pscc->entries[0], &sym, > > > - &off)) > > > - debug_hooks->register_external_die (pscc->entries[0], > > > - e.sym, e.off); > > > - } > > > + process_dref (data_in, pscc->entries[0], from); > > > lto_maybe_register_decl (data_in, pscc->entries[0], from); > > > streamer_tree_cache_replace_tree (cache, pscc->entries[0], from); > > > } > > > @@ -1785,7 +1793,65 @@ unify_scc (class data_in *data_in, unsigned from, > > > return unified_p; > > > } > > > > > > +typedef int_hash<unsigned, 0, UINT_MAX> code_id_hash; > > > + > > > +/* Do registering necessary once new tree fully streamed in (including > > > all > > > + trees it reffers to). */ > > > + > > > +static void > > > +process_new_tree (tree t, hash_map <code_id_hash, unsigned> *hm, > > > + unsigned index, unsigned *total, class data_in *data_in) > > > +{ > > > + /* Reconstruct the type variant and pointer-to/reference-to > > > + chains. */ > > > + if (TYPE_P (t)) > > > + { > > > + /* Map the tree types to their frequencies. */ > > > + if (flag_lto_dump_type_stats) > > > + { > > > + unsigned key = (unsigned) TREE_CODE (t); > > > + unsigned *countp = hm->get (key); > > > + hm->put (key, countp ? (*countp) + 1 : 1); > > > + (*total)++; > > > + } > > > + > > > + num_prevailing_types++; > > > + lto_fixup_prevailing_type (t); > > > > > > + /* Compute the canonical type of all non-ODR types. > > > + Delay ODR types for the end of merging process - the canonical > > > + type for those can be computed using the (unique) name however > > > + we want to do this only if units in other languages do not > > > + contain structurally equivalent type. > > > + > > > + Because SCC components are streamed in random (hash) order > > > + we may have encountered the type before while registering > > > + type canonical of a derived type in the same SCC. */ > > > + if (!TYPE_CANONICAL (t)) > > > + { > > > + if (!RECORD_OR_UNION_TYPE_P (t) > > > + || !TYPE_CXX_ODR_P (t)) > > > + gimple_register_canonical_type (t); > > > + else if (COMPLETE_TYPE_P (t)) > > > + vec_safe_push (types_to_register, t); > > > + } > > > + if (TYPE_MAIN_VARIANT (t) == t && odr_type_p (t)) > > > + register_odr_type (t); > > > + } > > > + /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its > > > + type which is also member of this SCC. */ > > > + if (TREE_CODE (t) == INTEGER_CST > > > + && !TREE_OVERFLOW (t)) > > > + cache_integer_cst (t); > > > + if (!flag_ltrans) > > > + { > > > + lto_maybe_register_decl (data_in, t, index); > > > + /* Scan the tree for references to global functions or > > > + variables and record those for later fixup. */ > > > + if (mentions_vars_p (t)) > > > + vec_safe_push (tree_with_vars, t); > > > + } > > > +} > > > > > > /* Read all the symbols from buffer DATA, using descriptors in DECL_DATA. > > > RESOLUTIONS is the set of symbols picked by the linker (read from the > > > @@ -1813,7 +1879,6 @@ lto_read_decls (struct lto_file_decl_data > > > *decl_data, const void *data, > > > /* We do not uniquify the pre-loaded cache entries, those are > > > middle-end > > > internal types that should not be merged. */ > > > > > > - typedef int_hash<unsigned, 0, UINT_MAX> code_id_hash; > > > hash_map <code_id_hash, unsigned> hm; > > > unsigned total = 0; > > > > > > @@ -1824,31 +1889,41 @@ lto_read_decls (struct lto_file_decl_data > > > *decl_data, const void *data, > > > unsigned from = data_in->reader_cache->nodes.length (); > > > /* Read and uniquify SCCs as in the input stream. */ > > > enum LTO_tags tag = streamer_read_record_start (&ib_main); > > > - if (tag == LTO_tree_scc) > > > + if (tag == LTO_tree_scc || tag == LTO_trees) > > > { > > > unsigned len_; > > > unsigned scc_entry_len; > > > + > > > + /* Because we stream in SCC order we know that all unshared trees > > > + are now fully streamed. Process them. */ > > > hashval_t scc_hash = lto_input_scc (&ib_main, data_in, &len_, > > > - &scc_entry_len); > > > + &scc_entry_len, > > > + tag == LTO_tree_scc); > > > unsigned len = data_in->reader_cache->nodes.length () - from; > > > gcc_assert (len == len_); > > > > > > - total_scc_size += len; > > > - num_sccs_read++; > > > + if (tag == LTO_tree_scc) > > > + { > > > + total_scc_size += len; > > > + num_sccs_read++; > > > + } > > > + else > > > + num_unshared_trees_read += len; > > > > > > /* We have the special case of size-1 SCCs that are pre-merged > > > by means of identifier and string sharing for example. > > > ??? Maybe we should avoid streaming those as SCCs. */ > > > tree first = streamer_tree_cache_get_tree (data_in->reader_cache, > > > from); > > > - if (len == 1 > > > - && (TREE_CODE (first) == IDENTIFIER_NODE > > > - || (TREE_CODE (first) == INTEGER_CST > > > - && !TREE_OVERFLOW (first)))) > > > - continue; > > > + /* Identifier and integers are shared specially, they should never > > > + go by the tree merging path. */ > > > + gcc_checking_assert ((TREE_CODE (first) != IDENTIFIER_NODE > > > + && (TREE_CODE (first) != INTEGER_CST > > > + || TREE_OVERFLOW (first))) > > > + || len != 1); > > > > > > /* Try to unify the SCC with already existing ones. */ > > > - if (!flag_ltrans > > > + if (!flag_ltrans && tag != LTO_trees > > > && unify_scc (data_in, from, > > > len, scc_entry_len, scc_hash)) > > > continue; > > > @@ -1862,56 +1937,9 @@ lto_read_decls (struct lto_file_decl_data > > > *decl_data, const void *data, > > > { > > > tree t = streamer_tree_cache_get_tree (data_in->reader_cache, > > > from + i); > > > - /* Reconstruct the type variant and pointer-to/reference-to > > > - chains. */ > > > + process_new_tree (t, &hm, from + i, &total, data_in); > > > if (TYPE_P (t)) > > > - { > > > - /* Map the tree types to their frequencies. */ > > > - if (flag_lto_dump_type_stats) > > > - { > > > - unsigned key = (unsigned) TREE_CODE (t); > > > - unsigned *countp = hm.get (key); > > > - hm.put (key, countp ? (*countp) + 1 : 1); > > > - total++; > > > - } > > > - > > > - seen_type = true; > > > - num_prevailing_types++; > > > - lto_fixup_prevailing_type (t); > > > - > > > - /* Compute the canonical type of all non-ODR types. > > > - Delay ODR types for the end of merging process - the > > > canonical > > > - type for those can be computed using the (unique) name > > > however > > > - we want to do this only if units in other languages do not > > > - contain structurally equivalent type. > > > - > > > - Because SCC components are streamed in random (hash) order > > > - we may have encountered the type before while registering > > > - type canonical of a derived type in the same SCC. */ > > > - if (!TYPE_CANONICAL (t)) > > > - { > > > - if (!RECORD_OR_UNION_TYPE_P (t) > > > - || !TYPE_CXX_ODR_P (t)) > > > - gimple_register_canonical_type (t); > > > - else if (COMPLETE_TYPE_P (t)) > > > - vec_safe_push (types_to_register, t); > > > - } > > > - if (TYPE_MAIN_VARIANT (t) == t && odr_type_p (t)) > > > - register_odr_type (t); > > > - } > > > - /* Link shared INTEGER_CSTs into TYPE_CACHED_VALUEs of its > > > - type which is also member of this SCC. */ > > > - if (TREE_CODE (t) == INTEGER_CST > > > - && !TREE_OVERFLOW (t)) > > > - cache_integer_cst (t); > > > - if (!flag_ltrans) > > > - { > > > - lto_maybe_register_decl (data_in, t, from + i); > > > - /* Scan the tree for references to global functions or > > > - variables and record those for later fixup. */ > > > - if (mentions_vars_p (t)) > > > - vec_safe_push (tree_with_vars, t); > > > - } > > > + seen_type = true; > > > } > > > > > > /* Register DECLs with the debuginfo machinery. */ > > > @@ -1926,9 +1954,26 @@ lto_read_decls (struct lto_file_decl_data > > > *decl_data, const void *data, > > > } > > > else > > > { > > > - /* Pickle stray references. */ > > > t = lto_input_tree_1 (&ib_main, data_in, tag, 0); > > > - gcc_assert (t && data_in->reader_cache->nodes.length () == from); > > > + /* We streamed in new tree. Add it to cache and process dref. */ > > > + if (data_in->reader_cache->nodes.length () == from + 1) > > > + { > > > + num_unshared_trees_read++; > > > + data_in->location_cache.accept_location_cache (); > > > + process_dref (data_in, t, from); > > > + if (TREE_CODE (t) == IDENTIFIER_NODE > > > + || (TREE_CODE (t) == INTEGER_CST > > > + && !TREE_OVERFLOW (t))) > > > + ; > > > + else > > > + { > > > + lto_maybe_register_decl (data_in, t, from); > > > + process_new_tree (t, &hm, from, &total, data_in); > > > + } > > > + } > > > + else > > > + /* FIXME: It seems useless to pickle stray references. */ > > > + gcc_assert (data_in->reader_cache->nodes.length () == from); > > > } > > > } > > > > > > @@ -2953,10 +2998,13 @@ print_lto_report_1 (void) > > > const char *pfx = (flag_lto) ? "LTO" : (flag_wpa) ? "WPA" : "LTRANS"; > > > fprintf (stderr, "%s statistics\n", pfx); > > > > > > - fprintf (stderr, "[%s] read %lu SCCs of average size %f\n", > > > + fprintf (stderr, "[%s] read %lu unshared trees\n", > > > + pfx, num_unshared_trees_read); > > > + fprintf (stderr, "[%s] read %lu mergeable SCCs of average size %f\n", > > > pfx, num_sccs_read, total_scc_size / (double)num_sccs_read); > > > - fprintf (stderr, "[%s] %lu tree bodies read in total\n", pfx, > > > total_scc_size); > > > - if (flag_wpa && tree_scc_hash) > > > + fprintf (stderr, "[%s] %lu tree bodies read in total\n", pfx, > > > + total_scc_size + num_unshared_trees_read); > > > + if (flag_wpa && tree_scc_hash && num_sccs_read) > > > { > > > fprintf (stderr, "[%s] tree SCC table: size %ld, %ld elements, " > > > "collision ratio: %f\n", pfx, > > > > > > > -- > > Richard Biener <rguent...@suse.de> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg) > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)