On Fri, May 23, 2025 at 11:31:26AM -0400, Jason Merrill wrote: > On 5/21/25 10:15 PM, Nathaniel Shead wrote: > > Another approach would be to fix 'write_class_def' to handle these > > declarations better, but that ended up being more work and felt fragile. > > It also meant streaming a lot more information that we don't need. > > > > Long term I had been playing around with reworking ubsan.cc entirely to > > have a fixed set of types it would use we that we could merge with, but > > given that there seems to be at least one other place we are creating > > ad-hoc types (the struct for constexpr new allocations), and I couldn't > > see an easy way of reworking that, I thought we should support this. > > > > Finally, I'm not 100% certain about the hard-coding MK_unique for fields > > of contextless types, but given that we've given up merging the owning > > TYPE_DECL with anything anyway I think it should be OK. > > > > -- >8 -- > > > > Currently, most declarations must have a DECL_CONTEXT for modules > > streaming to behave correctly, so that they can have an appropriate > > merge key generated and be correctly deduplicated on import. > > > > There are a few exceptions, however, for internally generated > > declarations that will never be merged and don't necessarily have an > > appropriate parent to key off for the context. One case that's come up > > a few times is TYPE_DECLs, especially temporary RECORD_TYPEs used as > > intermediaries within expressions. > > > > Previously I've tried to give all such types a DECL_CONTEXT, but in some > > cases that has ended up being infeasible, such as with the types > > generated by UBSan (which are shared with the C frontend and don't know > > their context, especially when created at global scope). Additionally, > > these types often don't have many of the parts that a normal struct > > declaration created via parsing user code would have, which confuses > > module streaming. > > > > Given that these types are typically intended to be one-off and unique > > anyway, this patch instead adds support for by-value streaming of > > uncontexted TYPE_DECLs. The patch only support streaming the bare > > minimum amount of fields needed for the cases I've come across so far; > > in general the preference should still be to ensure that DECL_CONTEXT is > > set where possible. > > We should be able to distinguish such types by CLASS_TYPE_P, which is false > for them. > > Jason >
Right, thanks; here's an updated version that adds this check to 'trees_out::tree_node'. I think leaving the other asserts is still valuable to catch if we ever have cases that start using these bits in the future. Bootstraped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- Currently, most declarations must have a DECL_CONTEXT for modules streaming to behave correctly, so that they can have an appropriate merge key generated and be correctly deduplicated on import. There are a few exceptions, however, for internally generated declarations that will never be merged and don't necessarily have an appropriate parent to key off for the context. One case that's come up a few times is TYPE_DECLs, especially temporary RECORD_TYPEs used as intermediaries within expressions. Previously I've tried to give all such types a DECL_CONTEXT, but in some cases that has ended up being infeasible, such as with the types generated by UBSan (which are shared with the C frontend and don't know their context, especially when created at global scope). Additionally, these types often don't have many of the parts that a normal struct declaration created via parsing user code would have, which confuses module streaming. Given that these types are typically intended to be one-off and unique anyway, this patch instead adds support for by-value streaming of uncontexted TYPE_DECLs. The patch only support streaming the bare minimum amount of fields needed for the cases I've come across so far; in general the preference should still be to ensure that DECL_CONTEXT is set where possible. PR c++/98735 PR c++/120040 gcc/cp/ChangeLog: * module.cc (trees_out::tree_value): Write TYPE_DECLs. (trees_in::tree_value): Read TYPE_DECLs. (trees_out::tree_node): Support uncontexted TYPE_DECLs, and ensure that all parts of a by-value decl are marked for streaming. (trees_out::get_merge_kind): Treat members of uncontexted types as always unique. gcc/testsuite/ChangeLog: * g++.dg/modules/pr120040_a.C: New test. * g++.dg/modules/pr120040_b.C: New test. Signed-off-by: Nathaniel Shead <nathanielosh...@gmail.com> --- gcc/cp/module.cc | 104 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 99 insertions(+), 5 deletions(-) diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc index f267c3e5fda..d587835dd4f 100644 --- a/gcc/cp/module.cc +++ b/gcc/cp/module.cc @@ -9654,9 +9654,10 @@ trees_out::tree_value (tree t) if (DECL_P (t)) /* No template, type, var or function, except anonymous - non-context vars. */ + non-context vars and types. */ gcc_checking_assert ((TREE_CODE (t) != TEMPLATE_DECL - && TREE_CODE (t) != TYPE_DECL + && (TREE_CODE (t) != TYPE_DECL + || (DECL_ARTIFICIAL (t) && !DECL_CONTEXT (t))) && (TREE_CODE (t) != VAR_DECL || (!DECL_NAME (t) && !DECL_CONTEXT (t))) && TREE_CODE (t) != FUNCTION_DECL)); @@ -9670,7 +9671,7 @@ trees_out::tree_value (tree t) tree_node_bools (t); } - if (TREE_CODE (t) == TREE_BINFO) + if (TREE_CODE (t) == TREE_BINFO) /* Binfos are decl-like and need merging information. */ binfo_mergeable (t); @@ -9679,8 +9680,51 @@ trees_out::tree_value (tree t) dump (dumper::TREE) && dump ("Writing tree:%d %C:%N", tag, TREE_CODE (t), t); + int type_tag = 0; + tree type = NULL_TREE; + if (TREE_CODE (t) == TYPE_DECL) + { + type = TREE_TYPE (t); + + /* We only support a limited set of features for uncontexted types; + these are typically types created in the language-independent + parts of the frontend (such as ubsan). */ + gcc_checking_assert (RECORD_OR_UNION_TYPE_P (type) + && TYPE_MAIN_VARIANT (type) == type + && TYPE_NAME (type) == t + && TYPE_STUB_DECL (type) == t + && !TYPE_VFIELD (type) + && !TYPE_BINFO (type)); + gcc_checking_assert (!TYPE_LANG_SPECIFIC (type) + || (!TYPE_CONTAINS_VPTR_P (type) + && !CLASSTYPE_MEMBER_VEC (type))); + + if (streaming_p ()) + { + start (type); + tree_node_bools (type); + } + + type_tag = insert (type, WK_value); + if (streaming_p ()) + dump (dumper::TREE) + && dump ("Writing type: %d %C:%N", type_tag, + TREE_CODE (type), type); + } + tree_node_vals (t); + if (type) + { + tree_node_vals (type); + tree_node (TYPE_SIZE (type)); + tree_node (TYPE_SIZE_UNIT (type)); + chained_decls (TYPE_FIELDS (type)); + if (streaming_p ()) + dump (dumper::TREE) + && dump ("Written type:%d %C:%N", type_tag, TREE_CODE (type), type); + } + if (streaming_p ()) dump (dumper::TREE) && dump ("Written tree:%d %C:%N", tag, TREE_CODE (t), t); } @@ -9719,14 +9763,48 @@ trees_in::tree_value () dump (dumper::TREE) && dump ("Reading tree:%d %C", tag, TREE_CODE (t)); - if (!tree_node_vals (t)) + int type_tag = 0; + tree type = NULL_TREE; + if (TREE_CODE (t) == TYPE_DECL) + { + type = start (); + if (!type || !tree_node_bools (type)) + t = NULL_TREE; + + type_tag = insert (type); + if (t) + dump (dumper::TREE) + && dump ("Reading type:%d %C", type_tag, TREE_CODE (type)); + } + + if (!t) { +bail: back_refs[~tag] = NULL_TREE; + if (type_tag) + back_refs[~type_tag] = NULL_TREE; set_overrun (); - /* Bail. */ return NULL_TREE; } + if (!tree_node_vals (t)) + goto bail; + + if (type) + { + if (!tree_node_vals (type)) + goto bail; + + TYPE_SIZE (type) = tree_node (); + TYPE_SIZE_UNIT (type) = tree_node (); + TYPE_FIELDS (type) = chained_decls (); + if (get_overrun ()) + goto bail; + + dump (dumper::TREE) + && dump ("Read type:%d %C:%N", type_tag, TREE_CODE (type), type); + } + if (TREE_CODE (t) == LAMBDA_EXPR && CLASSTYPE_LAMBDA_EXPR (TREE_TYPE (t))) { @@ -9980,7 +10058,18 @@ trees_out::tree_node (tree t) PARM_DECLS. It'd be nice if they had a distinguishing flag to double check. */ break; + + case TYPE_DECL: + /* Some parts of the compiler need internal struct types; + these types may not have an appropriate context to use. + Walk the whole type (including its definition) by value. */ + gcc_checking_assert (DECL_ARTIFICIAL (t) + && TYPE_ARTIFICIAL (TREE_TYPE (t)) + && RECORD_OR_UNION_TYPE_P (TREE_TYPE (t)) + && !CLASS_TYPE_P (TREE_TYPE (t))); + break; } + mark_declaration (t, has_definition (t)); goto by_value; } } @@ -11120,6 +11209,11 @@ trees_out::get_merge_kind (tree decl, depset *dep) return MK_local_friend; gcc_checking_assert (TYPE_P (ctx)); + + /* Internal-only types will not need to dedup their members. */ + if (!DECL_CONTEXT (TYPE_NAME (ctx))) + return MK_unique; + if (TREE_CODE (decl) == USING_DECL) return MK_field; -- 2.47.0