This patch introduces canonical types into GCC, which allow us to
compare two types very efficiently and results in an overall
compile-time performance improvement. I have been seeing 3-5%
improvements in compile time on the G++ and libstdc++ test suites,
5-10% on template-heavy (but realistic) code in Boost, and up to 85%
improvement for extremely template-heavy metaprogramming.
Canonical types also make the GCC type system more regular.  In GCC,
we often create many different variants of the same type that are all
considered equivalent (e.g., comptypes returns true for both). For
instance, "int" and a typedef of "int" will have separate INTEGER_TYPE
nodes, but these will compare equal when we're performing type
checking. Unfortunately, we need to perform a deep comparison,
checking cv-qualifiers, attributes, precision, alignment,
etc. Canonical types add a new field to every type which points to a
representative of the equivalent class of types: all types in GCC that
are equivalent will point to the same canonical type, so we can use
the canonical types to quickly check type equivalence. Canonical types
were discussed in this thread:

 http://gcc.gnu.org/ml/gcc/2006-11/msg00192.html

This patch adds two new fields to every _TYPE, and maintains those
fields throughout the C, C++, Objective-C and Objective-C++ front
ends. The first field is TYPE_CANONICAL, which is the canonical type
of each type. The second field is TYPE_STRUCTURAL_EQUALITY: when this
bit is set, it means that TYPE_CANONICAL cannot be used reliably for
type equality checking, so we must perform structural tests. We use
TYPE_STRUCTURAL_EQUALITY when we create a new type node for which we
can't find the canonical type. Over time, we should work to eliminate
TYPE_STRUCTURAL_EQUALITY, because each case that requires it also
means that we're wasting memory for duplicated type nodes and wasting
cycles performing deep comparisons.

There is some danger in introducing canonical types: if we fail to
canonicalize properly, types won't compare equal when they should, and
we'll break code. To mitigate this problem, I have added a flags
-f(no-)check-canonical-types. When enabled, GCC will check the
canonical types against the structural types; where they differ, it
will print a warning and use the structural information. When
disabled, we only use canonical types... and get better compile-time
performance. Canonical type checking is disabled by default for normal
builds, enabled by default when --enable-checking is present. Note,
however, that this flag is temporary: after a major release or two,
when we're sure that we have our canonical type system, we can
eliminate it.

Bootstrapped C, C++, Objective-C, Objective-C++, Java on
i686-pc-linux-gnu. Tested C, C++, Objective-C, Objective-C++, and
libstdc++ test suites; no new regressions. I've spot-checked some
template-heavy Boost libraries; no regressions there, either.

This is part one of three, containing changes to the parts of the
compiler that are shared amount the C family of languages.

Okay for mainline?

 Cheers,
 Doug Gregor
 Open Systems Lab @ Indiana University

2006-11-28  Douglas Gregor  <[EMAIL PROTECTED]>

        * builtins.c (std_gimplify_va_arg_expr): Keep track of the
        canonical type when building a variant with a different alignment.
        * c-common.c (flag_check_canonical_types): New.
        (c_common_nodes_and_builtins): Since variants of void_type_node
        get built before it is given a name, we need to give those
        variants the name, too.
        (handle_packed_attribute): When building the type variant, set the
        canonical type appropriately.
        (handle_unused_attribute): When building the type variant, set the
        canonical type appropriately.
        (handle_aligned_attribute): Ditto.
        (handle_deprecated_attribute): Ditto.
        (complete_array_type): We need to work with the canonical main
        type of the array, from which we will build the qualified version.
        * c-common.h (flag_check_canonical_types): New.
        * c.opt (fcheck-canonical-types): New.
        * c-opts.c (c_common_handle_option): Handle
        -f(no-)check-canonical-types.
        * print-tree.c (print_node): Display canonical type information
        for each type.
        * stor-layout.c (layout_type): When we don't know the
        alignment of a type for which we're building an array, we end up
        guessing wrong, so make the type require structural equality.
        * tree.c (make_node_stat): When we build a new type, it is its
        own canonical type.
        (build_type_attribute_qual_variant): When building an attribute
        variant, its canonical type is the non-attribute variant. However,
        if the attributes are target-dependent and they differ, we need to
        use structural equality checks for this type.
        (build_qualified_type): Ditto.
        (build_distinct_type_copy): When building a distinct type from
        another type, the new type is its own canonical type.
        (build_pointer_type_for_mode): When building a pointer type, also
        build a canonical type pointer.
        (build_reference_type_for_mode): When building a reference type,
        also build a canonical type reference.
        (build_index_type): When we can't hash an index type (e.g.,
        because its maximum value is negative), the index type requires
        structural equality tests.
        (build_array_type): Build the canonical form of an array type.
        (build_function_type): Function types require structural equality,
        because they contain default arguments, attributes, etc.
        (build_method_type_directly): Ditto for method types.
        (build_offset_type): Build the canonical offset type.
        (build_complex_type): Build the canonical vector type.
        (make_vector_type): Build the canonical vector type.
        (build_common_tree_nodes_2): Set the canonical type of
        va_list_type_node appropriately.
        * tree.h (TYPE_CANONICAL): New.
        (TYPE_STRUCTURAL_EQUALITY): New.
        (struct tree_type): Added structural_equality, unused_bits,
        canonical fields.
Index: builtins.c
===================================================================
--- builtins.c	(revision 119271)
+++ builtins.c	(working copy)
@@ -4407,8 +4407,10 @@ std_gimplify_va_arg_expr (tree valist, t
   boundary *= BITS_PER_UNIT;
   if (boundary < TYPE_ALIGN (type))
     {
-      type = build_variant_type_copy (type);
+      tree orig_type = type;
+      type = build_variant_type_copy (orig_type);
       TYPE_ALIGN (type) = boundary;
+      TYPE_CANONICAL (type) = TYPE_CANONICAL (orig_type);
     }
 
   /* Compute the rounded size of the type.  */
Index: c-common.c
===================================================================
--- c-common.c	(revision 119271)
+++ c-common.c	(working copy)
@@ -396,6 +396,18 @@ int flag_conserve_space;
 
 int flag_access_control = 1;
 
+/* Nonzero if we want to check the canonical types against structural
+   types. Useful for debugging errors due to incorrect canonical type
+   assignment. */
+
+int flag_check_canonical_types = 
+#ifdef ENABLE_CHECKING
+  1
+#else
+  0
+#endif
+  ;
+
 /* Nonzero if we want to check the return value of new and avoid calling
    constructors if it is a null pointer.  */
 
@@ -3379,6 +3391,16 @@ c_common_nodes_and_builtins (void)
 
   record_builtin_type (RID_VOID, NULL, void_type_node);
 
+  /* Set the TYPE_NAME for any variants that were built before
+     record_builtin_type gave names to the built-in types. */
+  {
+    tree void_name = TYPE_NAME (void_type_node);
+    TYPE_NAME (void_type_node) = NULL_TREE;
+    TYPE_NAME (build_qualified_type (void_type_node, TYPE_QUAL_CONST))
+      = void_name;
+    TYPE_NAME (void_type_node) = void_name;
+  }
+
   /* This node must not be shared.  */
   void_zero_node = make_node (INTEGER_CST);
   TREE_TYPE (void_zero_node) = void_type_node;
@@ -4237,7 +4259,13 @@ handle_packed_attribute (tree *node, tre
   if (TYPE_P (*node))
     {
       if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
-	*node = build_variant_type_copy (*node);
+	{
+	  tree orig_node = *node;
+	  *node = build_variant_type_copy (orig_node);
+	  TYPE_CANONICAL (*node) = TYPE_CANONICAL (orig_node);
+	  TYPE_STRUCTURAL_EQUALITY (*node) = 
+	    TYPE_STRUCTURAL_EQUALITY (orig_node);
+	}
       TYPE_PACKED (*node) = 1;
     }
   else if (TREE_CODE (*node) == FIELD_DECL)
@@ -4463,7 +4491,13 @@ handle_unused_attribute (tree *node, tre
   else
     {
       if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
-	*node = build_variant_type_copy (*node);
+	{
+	  tree orig_node = *node;
+	  *node = build_variant_type_copy (orig_node);
+	  TYPE_CANONICAL (*node) = TYPE_CANONICAL (orig_node);
+	  TYPE_STRUCTURAL_EQUALITY (*node) = 
+	    TYPE_STRUCTURAL_EQUALITY (orig_node);
+	}
       TREE_USED (*node) = 1;
     }
 
@@ -4891,10 +4925,17 @@ handle_aligned_attribute (tree *node, tr
 	  TYPE_NAME (*type) = decl;
 	  TREE_USED (*type) = TREE_USED (decl);
 	  TREE_TYPE (decl) = *type;
+	  TYPE_CANONICAL (*type) = TYPE_CANONICAL (tt);
+	  TYPE_STRUCTURAL_EQUALITY (*type) = TYPE_STRUCTURAL_EQUALITY (tt);
 	}
       else if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
-	*type = build_variant_type_copy (*type);
-
+	{
+	  tree orig_type = *type;
+	  *type = build_variant_type_copy (orig_type);
+	  TYPE_CANONICAL (*type) = TYPE_CANONICAL (orig_type);
+	  TYPE_STRUCTURAL_EQUALITY (*type) 
+	    = TYPE_STRUCTURAL_EQUALITY (orig_type);
+	}
       TYPE_ALIGN (*type) = (1 << i) * BITS_PER_UNIT;
       TYPE_USER_ALIGN (*type) = 1;
     }
@@ -5363,7 +5404,13 @@ handle_deprecated_attribute (tree *node,
   else if (TYPE_P (*node))
     {
       if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
-	*node = build_variant_type_copy (*node);
+	{
+	  tree orig_node = *node;
+	  *node = build_variant_type_copy (orig_node);
+	  TYPE_CANONICAL (*node) = TYPE_CANONICAL (orig_node);
+	  TYPE_STRUCTURAL_EQUALITY (*node) = 
+	    TYPE_STRUCTURAL_EQUALITY (orig_node);
+	}
       TREE_DEPRECATED (*node) = 1;
       type = *node;
     }
@@ -6284,6 +6331,7 @@ complete_array_type (tree *ptype, tree i
 {
   tree maxindex, type, main_type, elt, unqual_elt;
   int failure = 0, quals;
+  hashval_t hashcode = 0;
 
   maxindex = size_zero_node;
   if (initial_value)
@@ -6360,6 +6408,12 @@ complete_array_type (tree *ptype, tree i
   TYPE_DOMAIN (main_type) = build_index_type (maxindex);
   layout_type (main_type);
 
+  /* Make sure we have the canonical MAIN_TYPE. */
+  hashcode = iterative_hash_object (TYPE_HASH (unqual_elt), hashcode);
+  hashcode = iterative_hash_object (TYPE_HASH (TYPE_DOMAIN (main_type)), 
+				    hashcode);
+  main_type = type_hash_canon (hashcode, main_type);
+
   if (quals == 0)
     type = main_type;
   else
Index: c-common.h
===================================================================
--- c-common.h	(revision 119271)
+++ c-common.h	(working copy)
@@ -519,6 +519,12 @@ extern int flag_conserve_space;
 
 extern int flag_access_control;
 
+/* Nonzero if we want to check the canonical types against structural
+   types. Useful for debugging errors due to incorrect canonical type
+   assignment. */
+
+extern int flag_check_canonical_types;
+
 /* Nonzero if we want to check the return value of new and avoid calling
    constructors if it is a null pointer.  */
 
Index: c.opt
===================================================================
--- c.opt	(revision 119271)
+++ c.opt	(working copy)
@@ -465,6 +465,10 @@ Recognize built-in functions
 fbuiltin-
 C ObjC C++ ObjC++ Joined
 
+fcheck-canonical-types
+C++
+Check GCC's canonical type system.
+
 fcheck-new
 C++ ObjC++
 Check the return value of new
Index: c-opts.c
===================================================================
--- c-opts.c	(revision 119271)
+++ c-opts.c	(working copy)
@@ -648,6 +648,10 @@ c_common_handle_option (size_t scode, co
       flag_signed_char = !value;
       break;
 
+    case OPT_fcheck_canonical_types:
+      flag_check_canonical_types = value;
+      break;
+
     case OPT_fcheck_new:
       flag_check_new = value;
       break;
Index: print-tree.c
===================================================================
--- print-tree.c	(revision 119271)
+++ print-tree.c	(working copy)
@@ -604,6 +604,11 @@ print_node (FILE *file, const char *pref
 	       TYPE_ALIGN (node), TYPE_SYMTAB_ADDRESS (node),
 	       TYPE_ALIAS_SET (node));
 
+      if (TYPE_STRUCTURAL_EQUALITY (node))
+	fprintf (file, " structural equality");
+      else
+	dump_addr (file, " canonical type ", TYPE_CANONICAL (node));
+      
       print_node (file, "attributes", TYPE_ATTRIBUTES (node), indent + 4);
 
       if (INTEGRAL_TYPE_P (node) || TREE_CODE (node) == REAL_TYPE)
Index: stor-layout.c
===================================================================
--- stor-layout.c	(revision 119271)
+++ stor-layout.c	(working copy)
@@ -1782,6 +1782,11 @@ layout_type (tree type)
 #else
 	TYPE_ALIGN (type) = MAX (TYPE_ALIGN (element), BITS_PER_UNIT);
 #endif
+	if (!TYPE_SIZE (element))
+	  /* We don't know the size of the underlying element type, so
+	     our alignment calculations will be wrong, forcing us to
+	     fall back on structural equality. */
+	  TYPE_STRUCTURAL_EQUALITY (type) = 1;
 	TYPE_USER_ALIGN (type) = TYPE_USER_ALIGN (element);
 	TYPE_MODE (type) = BLKmode;
 	if (TYPE_SIZE (type) != 0
Index: tree.c
===================================================================
--- tree.c	(revision 119271)
+++ tree.c	(working copy)
@@ -557,6 +557,7 @@ make_node_stat (enum tree_code code MEM_
       TYPE_ALIGN (t) = BITS_PER_UNIT;
       TYPE_USER_ALIGN (t) = 0;
       TYPE_MAIN_VARIANT (t) = t;
+      TYPE_CANONICAL (t) = t;
 
       /* Default to no attributes for type, but let target change that.  */
       TYPE_ATTRIBUTES (t) = NULL_TREE;
@@ -3383,6 +3384,8 @@ build_type_attribute_qual_variant (tree 
       TYPE_POINTER_TO (ntype) = 0;
       TYPE_REFERENCE_TO (ntype) = 0;
       TYPE_ATTRIBUTES (ntype) = attribute;
+      TYPE_CANONICAL (ntype) = 
+	build_qualified_type (TYPE_CANONICAL (ttype), quals);
 
       /* Create a new main variant of TYPE.  */
       TYPE_MAIN_VARIANT (ntype) = ntype;
@@ -3421,6 +3424,13 @@ build_type_attribute_qual_variant (tree 
 	}
 
       ntype = type_hash_canon (hashcode, ntype);
+
+      /* If the target-dependent attributes make NTYPE different from
+	 its canonical type, we will need to use structural equality
+	 checks for this qualified type. */
+      if (!targetm.comp_type_attributes (ntype, ttype))
+	TYPE_STRUCTURAL_EQUALITY (ntype) = 1;
+
       ttype = build_qualified_type (ntype, quals);
     }
 
@@ -3868,6 +3878,11 @@ build_qualified_type (tree type, int typ
     {
       t = build_variant_type_copy (type);
       set_type_quals (t, type_quals);
+
+      if (TYPE_CANONICAL (type) != type)
+	TYPE_CANONICAL (t) = build_qualified_type (TYPE_CANONICAL (type),
+						   type_quals);
+      
     }
 
   return t;
@@ -3883,6 +3898,7 @@ build_distinct_type_copy (tree type)
   
   TYPE_POINTER_TO (t) = 0;
   TYPE_REFERENCE_TO (t) = 0;
+  TYPE_CANONICAL (t) = t;
 
   /* Make it its own variant.  */
   TYPE_MAIN_VARIANT (t) = t;
@@ -5001,6 +5017,12 @@ build_pointer_type_for_mode (tree to_typ
   TYPE_NEXT_PTR_TO (t) = TYPE_POINTER_TO (to_type);
   TYPE_POINTER_TO (to_type) = t;
 
+  if (TYPE_CANONICAL (to_type) != to_type)
+    TYPE_CANONICAL (t) = 
+      build_pointer_type_for_mode (TYPE_CANONICAL (to_type),
+				   mode, can_alias_all);
+  TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (to_type);
+
   /* Lay out the type.  This function has many callers that are concerned
      with expression-construction, and this simplifies them all.  */
   layout_type (t);
@@ -5050,6 +5072,12 @@ build_reference_type_for_mode (tree to_t
   TYPE_NEXT_REF_TO (t) = TYPE_REFERENCE_TO (to_type);
   TYPE_REFERENCE_TO (to_type) = t;
 
+  if (TYPE_CANONICAL (to_type) != to_type)
+    TYPE_CANONICAL (t) = 
+      build_reference_type_for_mode (TYPE_CANONICAL (to_type),
+				     mode, can_alias_all);
+  TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (to_type);
+
   layout_type (t);
 
   return t;
@@ -5116,7 +5144,12 @@ build_index_type (tree maxval)
   if (host_integerp (maxval, 1))
     return type_hash_canon (tree_low_cst (maxval, 1), itype);
   else
-    return itype;
+    {
+      /* Since we cannot hash this type, we need to compare it using
+	 structural equality checks. */
+      TYPE_STRUCTURAL_EQUALITY (itype) = 1;
+      return itype;
+    }
 }
 
 /* Builds a signed or unsigned integer type of precision PRECISION.
@@ -5208,6 +5241,18 @@ build_array_type (tree elt_type, tree in
       t = type_hash_canon (hashcode, t);
       if (save == t)
 	layout_type (t);
+
+      if (TYPE_CANONICAL (t) == t)
+	{
+	  if (TYPE_CANONICAL (elt_type) != elt_type)
+	    {
+	      TYPE_CANONICAL (t) = 
+		build_array_type (TYPE_CANONICAL (elt_type), index_type);
+	    }
+	  TYPE_STRUCTURAL_EQUALITY (t) = 
+	    TYPE_STRUCTURAL_EQUALITY (elt_type);
+	}
+
       return t;
     }
 
@@ -5217,6 +5262,21 @@ build_array_type (tree elt_type, tree in
 
   if (!COMPLETE_TYPE_P (t))
     layout_type (t);
+
+  if (TYPE_CANONICAL (t) == t)
+    {
+      if (TYPE_CANONICAL (elt_type) != elt_type
+	  || TYPE_CANONICAL (index_type) != index_type)
+	{
+	  TYPE_CANONICAL (t) = 
+	    build_array_type (TYPE_CANONICAL (elt_type),
+			      TYPE_CANONICAL (index_type));
+	}
+      TYPE_STRUCTURAL_EQUALITY (t) = 
+	TYPE_STRUCTURAL_EQUALITY (elt_type) 
+	| TYPE_STRUCTURAL_EQUALITY (index_type);
+    }
+
   return t;
 }
 
@@ -5257,6 +5317,7 @@ build_function_type (tree value_type, tr
   t = make_node (FUNCTION_TYPE);
   TREE_TYPE (t) = value_type;
   TYPE_ARG_TYPES (t) = arg_types;
+  TYPE_STRUCTURAL_EQUALITY (t) = 1;
 
   /* If we already have such a type, use the old one.  */
   hashcode = iterative_hash_object (TYPE_HASH (value_type), hashcode);
@@ -5324,6 +5385,7 @@ build_method_type_directly (tree basetyp
      which is "this".  Put it into the list of argument types.  */
   argtypes = tree_cons (NULL_TREE, ptype, argtypes);
   TYPE_ARG_TYPES (t) = argtypes;
+  TYPE_STRUCTURAL_EQUALITY (t) = 1;
 
   /* If we already have such a type, use the old one.  */
   hashcode = iterative_hash_object (TYPE_HASH (basetype), hashcode);
@@ -5376,6 +5438,20 @@ build_offset_type (tree basetype, tree t
   if (!COMPLETE_TYPE_P (t))
     layout_type (t);
 
+  if (TYPE_CANONICAL (t) == t)
+    {
+      if (TYPE_CANONICAL (basetype) != basetype
+	  || TYPE_CANONICAL (type) != type)
+	{
+	  TYPE_CANONICAL (t) = 
+	    build_offset_type (TYPE_CANONICAL (basetype), 
+			       TYPE_CANONICAL (type));
+	}
+      TYPE_STRUCTURAL_EQUALITY (t) = 
+	TYPE_STRUCTURAL_EQUALITY (basetype) 
+	| TYPE_STRUCTURAL_EQUALITY (type);
+    }
+
   return t;
 }
 
@@ -5399,6 +5475,15 @@ build_complex_type (tree component_type)
   if (!COMPLETE_TYPE_P (t))
     layout_type (t);
 
+  if (TYPE_CANONICAL (t) == t)
+    {
+      if (TYPE_CANONICAL (component_type) != component_type)
+	TYPE_CANONICAL (t) 
+	  = build_complex_type (TYPE_CANONICAL (component_type));
+      TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (component_type);
+
+    }
+
   /* If we are writing Dwarf2 output we need to create a name,
      since complex is a fundamental type.  */
   if ((write_symbols == DWARF2_DEBUG || write_symbols == VMS_AND_DWARF2_DEBUG)
@@ -6417,6 +6502,12 @@ make_vector_type (tree innertype, int nu
   TYPE_READONLY (t) = TYPE_READONLY (innertype);
   TYPE_VOLATILE (t) = TYPE_VOLATILE (innertype);
 
+  if (TYPE_CANONICAL (innertype) != innertype
+      || mode != VOIDmode)
+    TYPE_CANONICAL (t) = make_vector_type (TYPE_CANONICAL (innertype),
+					   nunits, VOIDmode);
+  TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (innertype);
+
   layout_type (t);
 
   {
@@ -6629,7 +6720,12 @@ build_common_tree_nodes_2 (int short_dou
        don't copy record types and let c_common_nodes_and_builtins()
        declare the type to be __builtin_va_list.  */
     if (TREE_CODE (t) != RECORD_TYPE)
-      t = build_variant_type_copy (t);
+      {
+	tree orig_t = t;
+	t = build_variant_type_copy (orig_t);
+	TYPE_CANONICAL (t) = TYPE_CANONICAL (orig_t);
+	TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (orig_t);
+      }
 
     va_list_type_node = t;
   }
Index: tree.h
===================================================================
--- tree.h	(revision 119271)
+++ tree.h	(working copy)
@@ -1927,6 +1927,9 @@ struct tree_block GTY(())
 #define TYPE_NEXT_VARIANT(NODE) (TYPE_CHECK (NODE)->type.next_variant)
 #define TYPE_MAIN_VARIANT(NODE) (TYPE_CHECK (NODE)->type.main_variant)
 #define TYPE_CONTEXT(NODE) (TYPE_CHECK (NODE)->type.context)
+#define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type.canonical)
+#define TYPE_STRUCTURAL_EQUALITY(NODE) \
+  (TYPE_CHECK (NODE)->type.structural_equality)
 #define TYPE_LANG_SPECIFIC(NODE) (TYPE_CHECK (NODE)->type.lang_specific)
 
 /* For a VECTOR_TYPE node, this describes a different type which is emitted
@@ -2111,6 +2114,9 @@ struct tree_type GTY(())
   unsigned lang_flag_6 : 1;
   unsigned user_align : 1;
 
+  unsigned structural_equality : 1;
+  unsigned unused_bits : 7;
+
   unsigned int align;
   tree pointer_to;
   tree reference_to;
@@ -2127,6 +2133,7 @@ struct tree_type GTY(())
   tree main_variant;
   tree binfo;
   tree context;
+  tree canonical;
   HOST_WIDE_INT alias_set;
   /* Points to a structure whose details depend on the language in use.  */
   struct lang_type *lang_specific;

Reply via email to