This patch introduces canonical types into GCC, which allow us to compare two types very efficiently and results in an overall compile-time performance improvement. I have been seeing 3-5% improvements in compile time on the G++ and libstdc++ test suites, 5-10% on template-heavy (but realistic) code in Boost, and up to 85% improvement for extremely template-heavy metaprogramming.
Canonical types also make the GCC type system more regular. In GCC, we often create many different variants of the same type that are all considered equivalent (e.g., comptypes returns true for both). For instance, "int" and a typedef of "int" will have separate INTEGER_TYPE nodes, but these will compare equal when we're performing type checking. Unfortunately, we need to perform a deep comparison, checking cv-qualifiers, attributes, precision, alignment, etc. Canonical types add a new field to every type which points to a representative of the equivalent class of types: all types in GCC that are equivalent will point to the same canonical type, so we can use the canonical types to quickly check type equivalence. Canonical types were discussed in this thread: http://gcc.gnu.org/ml/gcc/2006-11/msg00192.html This patch adds two new fields to every _TYPE, and maintains those fields throughout the C, C++, Objective-C and Objective-C++ front ends. The first field is TYPE_CANONICAL, which is the canonical type of each type. The second field is TYPE_STRUCTURAL_EQUALITY: when this bit is set, it means that TYPE_CANONICAL cannot be used reliably for type equality checking, so we must perform structural tests. We use TYPE_STRUCTURAL_EQUALITY when we create a new type node for which we can't find the canonical type. Over time, we should work to eliminate TYPE_STRUCTURAL_EQUALITY, because each case that requires it also means that we're wasting memory for duplicated type nodes and wasting cycles performing deep comparisons. There is some danger in introducing canonical types: if we fail to canonicalize properly, types won't compare equal when they should, and we'll break code. To mitigate this problem, I have added a flags -f(no-)check-canonical-types. When enabled, GCC will check the canonical types against the structural types; where they differ, it will print a warning and use the structural information. When disabled, we only use canonical types... and get better compile-time performance. Canonical type checking is disabled by default for normal builds, enabled by default when --enable-checking is present. Note, however, that this flag is temporary: after a major release or two, when we're sure that we have our canonical type system, we can eliminate it. Bootstrapped C, C++, Objective-C, Objective-C++, Java on i686-pc-linux-gnu. Tested C, C++, Objective-C, Objective-C++, and libstdc++ test suites; no new regressions. I've spot-checked some template-heavy Boost libraries; no regressions there, either. This is part one of three, containing changes to the parts of the compiler that are shared amount the C family of languages. Okay for mainline? Cheers, Doug Gregor Open Systems Lab @ Indiana University 2006-11-28 Douglas Gregor <[EMAIL PROTECTED]> * builtins.c (std_gimplify_va_arg_expr): Keep track of the canonical type when building a variant with a different alignment. * c-common.c (flag_check_canonical_types): New. (c_common_nodes_and_builtins): Since variants of void_type_node get built before it is given a name, we need to give those variants the name, too. (handle_packed_attribute): When building the type variant, set the canonical type appropriately. (handle_unused_attribute): When building the type variant, set the canonical type appropriately. (handle_aligned_attribute): Ditto. (handle_deprecated_attribute): Ditto. (complete_array_type): We need to work with the canonical main type of the array, from which we will build the qualified version. * c-common.h (flag_check_canonical_types): New. * c.opt (fcheck-canonical-types): New. * c-opts.c (c_common_handle_option): Handle -f(no-)check-canonical-types. * print-tree.c (print_node): Display canonical type information for each type. * stor-layout.c (layout_type): When we don't know the alignment of a type for which we're building an array, we end up guessing wrong, so make the type require structural equality. * tree.c (make_node_stat): When we build a new type, it is its own canonical type. (build_type_attribute_qual_variant): When building an attribute variant, its canonical type is the non-attribute variant. However, if the attributes are target-dependent and they differ, we need to use structural equality checks for this type. (build_qualified_type): Ditto. (build_distinct_type_copy): When building a distinct type from another type, the new type is its own canonical type. (build_pointer_type_for_mode): When building a pointer type, also build a canonical type pointer. (build_reference_type_for_mode): When building a reference type, also build a canonical type reference. (build_index_type): When we can't hash an index type (e.g., because its maximum value is negative), the index type requires structural equality tests. (build_array_type): Build the canonical form of an array type. (build_function_type): Function types require structural equality, because they contain default arguments, attributes, etc. (build_method_type_directly): Ditto for method types. (build_offset_type): Build the canonical offset type. (build_complex_type): Build the canonical vector type. (make_vector_type): Build the canonical vector type. (build_common_tree_nodes_2): Set the canonical type of va_list_type_node appropriately. * tree.h (TYPE_CANONICAL): New. (TYPE_STRUCTURAL_EQUALITY): New. (struct tree_type): Added structural_equality, unused_bits, canonical fields.
Index: builtins.c =================================================================== --- builtins.c (revision 119271) +++ builtins.c (working copy) @@ -4407,8 +4407,10 @@ std_gimplify_va_arg_expr (tree valist, t boundary *= BITS_PER_UNIT; if (boundary < TYPE_ALIGN (type)) { - type = build_variant_type_copy (type); + tree orig_type = type; + type = build_variant_type_copy (orig_type); TYPE_ALIGN (type) = boundary; + TYPE_CANONICAL (type) = TYPE_CANONICAL (orig_type); } /* Compute the rounded size of the type. */ Index: c-common.c =================================================================== --- c-common.c (revision 119271) +++ c-common.c (working copy) @@ -396,6 +396,18 @@ int flag_conserve_space; int flag_access_control = 1; +/* Nonzero if we want to check the canonical types against structural + types. Useful for debugging errors due to incorrect canonical type + assignment. */ + +int flag_check_canonical_types = +#ifdef ENABLE_CHECKING + 1 +#else + 0 +#endif + ; + /* Nonzero if we want to check the return value of new and avoid calling constructors if it is a null pointer. */ @@ -3379,6 +3391,16 @@ c_common_nodes_and_builtins (void) record_builtin_type (RID_VOID, NULL, void_type_node); + /* Set the TYPE_NAME for any variants that were built before + record_builtin_type gave names to the built-in types. */ + { + tree void_name = TYPE_NAME (void_type_node); + TYPE_NAME (void_type_node) = NULL_TREE; + TYPE_NAME (build_qualified_type (void_type_node, TYPE_QUAL_CONST)) + = void_name; + TYPE_NAME (void_type_node) = void_name; + } + /* This node must not be shared. */ void_zero_node = make_node (INTEGER_CST); TREE_TYPE (void_zero_node) = void_type_node; @@ -4237,7 +4259,13 @@ handle_packed_attribute (tree *node, tre if (TYPE_P (*node)) { if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) - *node = build_variant_type_copy (*node); + { + tree orig_node = *node; + *node = build_variant_type_copy (orig_node); + TYPE_CANONICAL (*node) = TYPE_CANONICAL (orig_node); + TYPE_STRUCTURAL_EQUALITY (*node) = + TYPE_STRUCTURAL_EQUALITY (orig_node); + } TYPE_PACKED (*node) = 1; } else if (TREE_CODE (*node) == FIELD_DECL) @@ -4463,7 +4491,13 @@ handle_unused_attribute (tree *node, tre else { if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) - *node = build_variant_type_copy (*node); + { + tree orig_node = *node; + *node = build_variant_type_copy (orig_node); + TYPE_CANONICAL (*node) = TYPE_CANONICAL (orig_node); + TYPE_STRUCTURAL_EQUALITY (*node) = + TYPE_STRUCTURAL_EQUALITY (orig_node); + } TREE_USED (*node) = 1; } @@ -4891,10 +4925,17 @@ handle_aligned_attribute (tree *node, tr TYPE_NAME (*type) = decl; TREE_USED (*type) = TREE_USED (decl); TREE_TYPE (decl) = *type; + TYPE_CANONICAL (*type) = TYPE_CANONICAL (tt); + TYPE_STRUCTURAL_EQUALITY (*type) = TYPE_STRUCTURAL_EQUALITY (tt); } else if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) - *type = build_variant_type_copy (*type); - + { + tree orig_type = *type; + *type = build_variant_type_copy (orig_type); + TYPE_CANONICAL (*type) = TYPE_CANONICAL (orig_type); + TYPE_STRUCTURAL_EQUALITY (*type) + = TYPE_STRUCTURAL_EQUALITY (orig_type); + } TYPE_ALIGN (*type) = (1 << i) * BITS_PER_UNIT; TYPE_USER_ALIGN (*type) = 1; } @@ -5363,7 +5404,13 @@ handle_deprecated_attribute (tree *node, else if (TYPE_P (*node)) { if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) - *node = build_variant_type_copy (*node); + { + tree orig_node = *node; + *node = build_variant_type_copy (orig_node); + TYPE_CANONICAL (*node) = TYPE_CANONICAL (orig_node); + TYPE_STRUCTURAL_EQUALITY (*node) = + TYPE_STRUCTURAL_EQUALITY (orig_node); + } TREE_DEPRECATED (*node) = 1; type = *node; } @@ -6284,6 +6331,7 @@ complete_array_type (tree *ptype, tree i { tree maxindex, type, main_type, elt, unqual_elt; int failure = 0, quals; + hashval_t hashcode = 0; maxindex = size_zero_node; if (initial_value) @@ -6360,6 +6408,12 @@ complete_array_type (tree *ptype, tree i TYPE_DOMAIN (main_type) = build_index_type (maxindex); layout_type (main_type); + /* Make sure we have the canonical MAIN_TYPE. */ + hashcode = iterative_hash_object (TYPE_HASH (unqual_elt), hashcode); + hashcode = iterative_hash_object (TYPE_HASH (TYPE_DOMAIN (main_type)), + hashcode); + main_type = type_hash_canon (hashcode, main_type); + if (quals == 0) type = main_type; else Index: c-common.h =================================================================== --- c-common.h (revision 119271) +++ c-common.h (working copy) @@ -519,6 +519,12 @@ extern int flag_conserve_space; extern int flag_access_control; +/* Nonzero if we want to check the canonical types against structural + types. Useful for debugging errors due to incorrect canonical type + assignment. */ + +extern int flag_check_canonical_types; + /* Nonzero if we want to check the return value of new and avoid calling constructors if it is a null pointer. */ Index: c.opt =================================================================== --- c.opt (revision 119271) +++ c.opt (working copy) @@ -465,6 +465,10 @@ Recognize built-in functions fbuiltin- C ObjC C++ ObjC++ Joined +fcheck-canonical-types +C++ +Check GCC's canonical type system. + fcheck-new C++ ObjC++ Check the return value of new Index: c-opts.c =================================================================== --- c-opts.c (revision 119271) +++ c-opts.c (working copy) @@ -648,6 +648,10 @@ c_common_handle_option (size_t scode, co flag_signed_char = !value; break; + case OPT_fcheck_canonical_types: + flag_check_canonical_types = value; + break; + case OPT_fcheck_new: flag_check_new = value; break; Index: print-tree.c =================================================================== --- print-tree.c (revision 119271) +++ print-tree.c (working copy) @@ -604,6 +604,11 @@ print_node (FILE *file, const char *pref TYPE_ALIGN (node), TYPE_SYMTAB_ADDRESS (node), TYPE_ALIAS_SET (node)); + if (TYPE_STRUCTURAL_EQUALITY (node)) + fprintf (file, " structural equality"); + else + dump_addr (file, " canonical type ", TYPE_CANONICAL (node)); + print_node (file, "attributes", TYPE_ATTRIBUTES (node), indent + 4); if (INTEGRAL_TYPE_P (node) || TREE_CODE (node) == REAL_TYPE) Index: stor-layout.c =================================================================== --- stor-layout.c (revision 119271) +++ stor-layout.c (working copy) @@ -1782,6 +1782,11 @@ layout_type (tree type) #else TYPE_ALIGN (type) = MAX (TYPE_ALIGN (element), BITS_PER_UNIT); #endif + if (!TYPE_SIZE (element)) + /* We don't know the size of the underlying element type, so + our alignment calculations will be wrong, forcing us to + fall back on structural equality. */ + TYPE_STRUCTURAL_EQUALITY (type) = 1; TYPE_USER_ALIGN (type) = TYPE_USER_ALIGN (element); TYPE_MODE (type) = BLKmode; if (TYPE_SIZE (type) != 0 Index: tree.c =================================================================== --- tree.c (revision 119271) +++ tree.c (working copy) @@ -557,6 +557,7 @@ make_node_stat (enum tree_code code MEM_ TYPE_ALIGN (t) = BITS_PER_UNIT; TYPE_USER_ALIGN (t) = 0; TYPE_MAIN_VARIANT (t) = t; + TYPE_CANONICAL (t) = t; /* Default to no attributes for type, but let target change that. */ TYPE_ATTRIBUTES (t) = NULL_TREE; @@ -3383,6 +3384,8 @@ build_type_attribute_qual_variant (tree TYPE_POINTER_TO (ntype) = 0; TYPE_REFERENCE_TO (ntype) = 0; TYPE_ATTRIBUTES (ntype) = attribute; + TYPE_CANONICAL (ntype) = + build_qualified_type (TYPE_CANONICAL (ttype), quals); /* Create a new main variant of TYPE. */ TYPE_MAIN_VARIANT (ntype) = ntype; @@ -3421,6 +3424,13 @@ build_type_attribute_qual_variant (tree } ntype = type_hash_canon (hashcode, ntype); + + /* If the target-dependent attributes make NTYPE different from + its canonical type, we will need to use structural equality + checks for this qualified type. */ + if (!targetm.comp_type_attributes (ntype, ttype)) + TYPE_STRUCTURAL_EQUALITY (ntype) = 1; + ttype = build_qualified_type (ntype, quals); } @@ -3868,6 +3878,11 @@ build_qualified_type (tree type, int typ { t = build_variant_type_copy (type); set_type_quals (t, type_quals); + + if (TYPE_CANONICAL (type) != type) + TYPE_CANONICAL (t) = build_qualified_type (TYPE_CANONICAL (type), + type_quals); + } return t; @@ -3883,6 +3898,7 @@ build_distinct_type_copy (tree type) TYPE_POINTER_TO (t) = 0; TYPE_REFERENCE_TO (t) = 0; + TYPE_CANONICAL (t) = t; /* Make it its own variant. */ TYPE_MAIN_VARIANT (t) = t; @@ -5001,6 +5017,12 @@ build_pointer_type_for_mode (tree to_typ TYPE_NEXT_PTR_TO (t) = TYPE_POINTER_TO (to_type); TYPE_POINTER_TO (to_type) = t; + if (TYPE_CANONICAL (to_type) != to_type) + TYPE_CANONICAL (t) = + build_pointer_type_for_mode (TYPE_CANONICAL (to_type), + mode, can_alias_all); + TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (to_type); + /* Lay out the type. This function has many callers that are concerned with expression-construction, and this simplifies them all. */ layout_type (t); @@ -5050,6 +5072,12 @@ build_reference_type_for_mode (tree to_t TYPE_NEXT_REF_TO (t) = TYPE_REFERENCE_TO (to_type); TYPE_REFERENCE_TO (to_type) = t; + if (TYPE_CANONICAL (to_type) != to_type) + TYPE_CANONICAL (t) = + build_reference_type_for_mode (TYPE_CANONICAL (to_type), + mode, can_alias_all); + TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (to_type); + layout_type (t); return t; @@ -5116,7 +5144,12 @@ build_index_type (tree maxval) if (host_integerp (maxval, 1)) return type_hash_canon (tree_low_cst (maxval, 1), itype); else - return itype; + { + /* Since we cannot hash this type, we need to compare it using + structural equality checks. */ + TYPE_STRUCTURAL_EQUALITY (itype) = 1; + return itype; + } } /* Builds a signed or unsigned integer type of precision PRECISION. @@ -5208,6 +5241,18 @@ build_array_type (tree elt_type, tree in t = type_hash_canon (hashcode, t); if (save == t) layout_type (t); + + if (TYPE_CANONICAL (t) == t) + { + if (TYPE_CANONICAL (elt_type) != elt_type) + { + TYPE_CANONICAL (t) = + build_array_type (TYPE_CANONICAL (elt_type), index_type); + } + TYPE_STRUCTURAL_EQUALITY (t) = + TYPE_STRUCTURAL_EQUALITY (elt_type); + } + return t; } @@ -5217,6 +5262,21 @@ build_array_type (tree elt_type, tree in if (!COMPLETE_TYPE_P (t)) layout_type (t); + + if (TYPE_CANONICAL (t) == t) + { + if (TYPE_CANONICAL (elt_type) != elt_type + || TYPE_CANONICAL (index_type) != index_type) + { + TYPE_CANONICAL (t) = + build_array_type (TYPE_CANONICAL (elt_type), + TYPE_CANONICAL (index_type)); + } + TYPE_STRUCTURAL_EQUALITY (t) = + TYPE_STRUCTURAL_EQUALITY (elt_type) + | TYPE_STRUCTURAL_EQUALITY (index_type); + } + return t; } @@ -5257,6 +5317,7 @@ build_function_type (tree value_type, tr t = make_node (FUNCTION_TYPE); TREE_TYPE (t) = value_type; TYPE_ARG_TYPES (t) = arg_types; + TYPE_STRUCTURAL_EQUALITY (t) = 1; /* If we already have such a type, use the old one. */ hashcode = iterative_hash_object (TYPE_HASH (value_type), hashcode); @@ -5324,6 +5385,7 @@ build_method_type_directly (tree basetyp which is "this". Put it into the list of argument types. */ argtypes = tree_cons (NULL_TREE, ptype, argtypes); TYPE_ARG_TYPES (t) = argtypes; + TYPE_STRUCTURAL_EQUALITY (t) = 1; /* If we already have such a type, use the old one. */ hashcode = iterative_hash_object (TYPE_HASH (basetype), hashcode); @@ -5376,6 +5438,20 @@ build_offset_type (tree basetype, tree t if (!COMPLETE_TYPE_P (t)) layout_type (t); + if (TYPE_CANONICAL (t) == t) + { + if (TYPE_CANONICAL (basetype) != basetype + || TYPE_CANONICAL (type) != type) + { + TYPE_CANONICAL (t) = + build_offset_type (TYPE_CANONICAL (basetype), + TYPE_CANONICAL (type)); + } + TYPE_STRUCTURAL_EQUALITY (t) = + TYPE_STRUCTURAL_EQUALITY (basetype) + | TYPE_STRUCTURAL_EQUALITY (type); + } + return t; } @@ -5399,6 +5475,15 @@ build_complex_type (tree component_type) if (!COMPLETE_TYPE_P (t)) layout_type (t); + if (TYPE_CANONICAL (t) == t) + { + if (TYPE_CANONICAL (component_type) != component_type) + TYPE_CANONICAL (t) + = build_complex_type (TYPE_CANONICAL (component_type)); + TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (component_type); + + } + /* If we are writing Dwarf2 output we need to create a name, since complex is a fundamental type. */ if ((write_symbols == DWARF2_DEBUG || write_symbols == VMS_AND_DWARF2_DEBUG) @@ -6417,6 +6502,12 @@ make_vector_type (tree innertype, int nu TYPE_READONLY (t) = TYPE_READONLY (innertype); TYPE_VOLATILE (t) = TYPE_VOLATILE (innertype); + if (TYPE_CANONICAL (innertype) != innertype + || mode != VOIDmode) + TYPE_CANONICAL (t) = make_vector_type (TYPE_CANONICAL (innertype), + nunits, VOIDmode); + TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (innertype); + layout_type (t); { @@ -6629,7 +6720,12 @@ build_common_tree_nodes_2 (int short_dou don't copy record types and let c_common_nodes_and_builtins() declare the type to be __builtin_va_list. */ if (TREE_CODE (t) != RECORD_TYPE) - t = build_variant_type_copy (t); + { + tree orig_t = t; + t = build_variant_type_copy (orig_t); + TYPE_CANONICAL (t) = TYPE_CANONICAL (orig_t); + TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (orig_t); + } va_list_type_node = t; } Index: tree.h =================================================================== --- tree.h (revision 119271) +++ tree.h (working copy) @@ -1927,6 +1927,9 @@ struct tree_block GTY(()) #define TYPE_NEXT_VARIANT(NODE) (TYPE_CHECK (NODE)->type.next_variant) #define TYPE_MAIN_VARIANT(NODE) (TYPE_CHECK (NODE)->type.main_variant) #define TYPE_CONTEXT(NODE) (TYPE_CHECK (NODE)->type.context) +#define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type.canonical) +#define TYPE_STRUCTURAL_EQUALITY(NODE) \ + (TYPE_CHECK (NODE)->type.structural_equality) #define TYPE_LANG_SPECIFIC(NODE) (TYPE_CHECK (NODE)->type.lang_specific) /* For a VECTOR_TYPE node, this describes a different type which is emitted @@ -2111,6 +2114,9 @@ struct tree_type GTY(()) unsigned lang_flag_6 : 1; unsigned user_align : 1; + unsigned structural_equality : 1; + unsigned unused_bits : 7; + unsigned int align; tree pointer_to; tree reference_to; @@ -2127,6 +2133,7 @@ struct tree_type GTY(()) tree main_variant; tree binfo; tree context; + tree canonical; HOST_WIDE_INT alias_set; /* Points to a structure whose details depend on the language in use. */ struct lang_type *lang_specific;