On Mon, 25 Nov 2024, Martin Uecker wrote: > > Hi Richard, > > here is another version. This now just ignores the size for all trailing > arrays which I think is the right thing to do. It also modifies the lto > hashing which also seems to work (but needs more testing and I haven't > added tests to the patch yet). > > I also added back the missing comparison for the field offset. > > Instead of checking for flexible arrays members, for the exceptions > I now simply use RECORD_OR_UNION_TYPE_P. The alternative would be to > add the bit, but this would then be for any nested trailing array > - not FAM. > > But I am not entirely sure adding the bit is worth it, as checking > for structures seems to be good enough - maybe for "verify_type" > we could then spend more time and check recursively for a trailing > array. > > Before I continue with this, what do you think? > > Comments and tests still need some improvements. > > Martin > > > > Fix type compatibility for types with flexible array member > [PR113688,PR114014,PR117724] > > verify_type checks the compatibility of TYPE_CANONICAL using > gimple_canonical_types_compatible_p. But it is stricter than what the > C standard requires and therefor inconsistent with how TYPE_CANONICAL is > set > in the C FE. Here, the logic is changed to ignore the array size when one > is the last element of a structure or union. To not get errors because of > an inconsistent number of members, zero-sized arrays are not ignored > anymore > when checking the fields of a struct (which is stricter than what was done > before). Finally, exceptions are added that allow the TYPE_MODE of a type > with an array as last member to differ from another compatible type. > > PR c/113688 > PR c/114014 > PR c/117724 > > gcc/ChangeLog: > * tree.cc (gimple_canonical_types_compatible_p): Revise > logic for types with array as last member. > (verify_type): Add exceptions. > > gcc/lto/ChangeLog: > * lto-common.cc (hash_canonical_type): Add exceptions. > > gcc/testsuite/ChangeLog: > * gcc.dg/pr113688.c: New test. > * gcc.dg/pr114014.c: New test. > * gcc.dg/pr117724.c: New test. > > diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc > index 86a309f92b4..f65a9d1c7b6 100644 > --- a/gcc/lto/lto-common.cc > +++ b/gcc/lto/lto-common.cc > @@ -254,7 +254,8 @@ hash_canonical_type (tree type) > checked. */ > code = tree_code_for_canonical_type_merging (TREE_CODE (type)); > hstate.add_int (code); > - hstate.add_int (TYPE_MODE (type)); > + if (!RECORD_OR_UNION_TYPE_P (type)) > + hstate.add_int (TYPE_MODE (type)); > > /* Incorporate common features of numerical types. */ > if (INTEGRAL_TYPE_P (type) > @@ -332,7 +333,11 @@ hash_canonical_type (tree type) > && (! DECL_SIZE (f) > || ! integer_zerop (DECL_SIZE (f)))) > { > - iterative_hash_canonical_type (TREE_TYPE (f), hstate); > + tree t = TREE_TYPE (f); > + if (!TREE_CHAIN (f) > + && TREE_CODE (t) == ARRAY_TYPE) > + t = TREE_TYPE (t);
I think this is fine, but it warrants a comment on why we do this. > + iterative_hash_canonical_type (t, hstate); > nf++; > } > > diff --git a/gcc/testsuite/gcc.dg/pr113688.c b/gcc/testsuite/gcc.dg/pr113688.c > new file mode 100644 > index 00000000000..8dee8c86f1b > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr113688.c > @@ -0,0 +1,8 @@ > +/* { dg-do compile } */ > +/* { dg-options "-g" } */ > + > +struct S{int x,y[1];}*a; > +int main(void){ > + struct S{int x,y[];}; > +} > + > diff --git a/gcc/testsuite/gcc.dg/pr114014.c b/gcc/testsuite/gcc.dg/pr114014.c > new file mode 100644 > index 00000000000..ab783f4f85d > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr114014.c > @@ -0,0 +1,14 @@ > +/* PR c/114014 > + * { dg-do compile } > + * { dg-options "-std=c23 -g" } */ > + > +struct r { > + int a; > + char b[]; > +}; > +struct r { > + int a; > + char b[0]; > +}; /* { dg-error "redefinition" } */ > + > + > diff --git a/gcc/testsuite/gcc.dg/pr117724.c b/gcc/testsuite/gcc.dg/pr117724.c > new file mode 100644 > index 00000000000..d631daeb644 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr117724.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-g" } */ > + > +struct { > + unsigned long len; > + unsigned long size; > + char data[]; > +}; /* { dg-warning "unnamed struct" } */ > +struct { > + struct { > + unsigned long len; > + unsigned long size; > + char data[6]; > + }; > +}; /* { dg-warning "unnamed struct" } */ > + > diff --git a/gcc/tree.cc b/gcc/tree.cc > index 125f38b1cfa..30e3b4531a1 100644 > --- a/gcc/tree.cc > +++ b/gcc/tree.cc > @@ -13907,8 +13907,12 @@ gimple_canonical_types_compatible_p (const_tree t1, > const_tree t2, > || TREE_CODE (t1) == NULLPTR_TYPE) > return true; > > - /* Can't be the same type if they have different mode. */ > - if (TYPE_MODE (t1) != TYPE_MODE (t2)) > + /* Can't be compatible types if they have different mode. We allow > + mismatching modes for structures or unions because they could end > + with flexible array member. */ > + if (!RECORD_OR_UNION_TYPE_P (t1) > + && !RECORD_OR_UNION_TYPE_P (t2) > + && (TYPE_MODE (t1) != TYPE_MODE (t2))) > return false; I think this is OK. There's also the point that the mode of a RECORD_OR_UNION_TYPE_P is really only used for RTL expansion purposes. You might argue checking !RECORD_OR_UNION_TYPE_P (t1) is enough, duplicating the check into the relevant switch cases (and omitting from record/union type) might be clearer. The mode check was supposed to be a cheap early test. I'd go with just checking !RECORD_OR_UNION_TYPE_P on t1 I guess. I'll note that for ARRAY_TYPE the mode is similarly "irrelevant". > /* Non-aggregate types can be handled cheaply. */ > @@ -13959,7 +13963,7 @@ gimple_canonical_types_compatible_p (const_tree t1, > const_tree t2, > { > case ARRAY_TYPE: > /* Array types are the same if the element types are the same and > - the number of elements are the same. */ > + minimum and maximum index are the same. */ > if (!gimple_canonical_types_compatible_p (TREE_TYPE (t1), TREE_TYPE > (t2), > trust_type_canonical) > || TYPE_STRING_FLAG (t1) != TYPE_STRING_FLAG (t2) > @@ -14053,23 +14057,35 @@ gimple_canonical_types_compatible_p (const_tree t1, > const_tree t2, > f1 || f2; > f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2)) > { > - /* Skip non-fields and zero-sized fields. */ > - while (f1 && (TREE_CODE (f1) != FIELD_DECL > - || (DECL_SIZE (f1) > - && integer_zerop (DECL_SIZE (f1))))) > + /* Skip non-fields. We used to skip zero-sized fields, but not > + allow them only at the end. */ I think this now makes int i : 32; int : 0; int j; and int i; int j; incompatible? Did you remove this because of actual issues or because you thought it shouldn't matter? The underlying idea is of course that a zero-sized field can be ignored and we want to conservatively treat structs which only differ in added zero-size fields as compatible (with LTO, for cross-language TBAA inter-operability). So the bitfield case might not be the only relevant thing. > + while (f1 && (TREE_CODE (f1) != FIELD_DECL)) > f1 = TREE_CHAIN (f1); > - while (f2 && (TREE_CODE (f2) != FIELD_DECL > - || (DECL_SIZE (f2) > - && integer_zerop (DECL_SIZE (f2))))) > + while (f2 && (TREE_CODE (f2) != FIELD_DECL)) > f2 = TREE_CHAIN (f2); > if (!f1 || !f2) > break; So previously struct { int n; } and struct { int n; int data[]; } were not compatible by this check. I guess also a trailing int : 0 made structs incompatible. I think we want to continue to ignore intermediate zero-size fields but change the loops to while (f1 && (TREE_CODE (f1) != FIELD_DECL || (DECL_SIZE (f1) && integer_zerop (DECL_SIZE (f1)) && TREE_CHAIN (f1)))) so we preserve the last field even when zero-size? It's a bit odd we treat incomplete (!DECL_SIZE) fields specially, but those should only appear as last field, maybe || (TREE_CHAIN (f1) && integer_zerop (DECL_SIZE (f1)) should work to avoid ICEing for those. But eventually we get into this function for not laid out types where the assumption might not hold ... > + > + tree t1 = TREE_TYPE (f1); > + tree t2 = TREE_TYPE (f2); > + > + /* Special case for array members at the end. */ .. because? > + if (TREE_CHAIN (f1) == NULL_TREE > + && TREE_CHAIN (f2) == NULL_TREE > + && TREE_CODE (t1) == ARRAY_TYPE > + && TREE_CODE (t2) == ARRAY_TYPE > + && TYPE_REVERSE_STORAGE_ORDER (t1) == > TYPE_REVERSE_STORAGE_ORDER (t2) > + && TYPE_NONALIASED_COMPONENT (t1) == TYPE_NONALIASED_COMPONENT > (t2) recursing on ARRAY_TYPE would have compared TYPE_STRING_FLAG > + && gimple_compare_field_offset (f1, f2) > + && gimple_canonical_types_compatible_p > + (TREE_TYPE (t1), TREE_TYPE (t2), > + trust_type_canonical)) > + ; > /* The fields must have the same name, offset and type. */ > - if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2) > + else if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2) any reason you elide DECL_NONADDRESSABLE_P comparing above? I think this is semantically relevant (only used from Ada). Given you basically only elide comparing of TYPE_DOMAIN I wonder if we shouldn't add another parameter to gimple_canonical_types_compatible_p indicating whether to disregard that? Or split out a helper function to compare the array-type-but-not-domain bits to make those the same between recursing or open-coding. I'd suggest to do /* The fields must have the same offset. */ if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2) || !gimple_compare_field_offset (f1, f2)) return false; upfront to isolate the sole difference on trailing ARRAY_TYPE handling better. I think the new mode handling makes sense in general, the LTO bits should work fine. The zero-size field ignoring stuff may have too many unrelated changes. So definitely better now. Thanks, Richard. > || !gimple_compare_field_offset (f1, f2) > || !gimple_canonical_types_compatible_p > - (TREE_TYPE (f1), TREE_TYPE (f2), > - trust_type_canonical)) > + (t1, t2, trust_type_canonical)) > return false; > } > > @@ -14213,6 +14229,10 @@ verify_type (const_tree t) > } > > if (COMPLETE_TYPE_P (t) && TYPE_CANONICAL (t) > + /* We allow a mismatch for structure or union because of > + flexible array members. */ > + && !RECORD_OR_UNION_TYPE_P (t) > + && !RECORD_OR_UNION_TYPE_P (TYPE_CANONICAL (t)) > && TYPE_MODE (t) != TYPE_MODE (TYPE_CANONICAL (t))) > { > error ("%<TYPE_MODE%> of %<TYPE_CANONICAL%> is not compatible"); > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)