Erik Trulsson <ertr1013 <at> student.uu.se> writes:
> On Sun, Jan 03, 2010 at 05:46:48AM +0000, Joshua Haberman wrote:
> > The aliasing policies that GCC implements seem to be more strict than
> > what is in the C99 standard.  I am wondering if this is true or whether
> > I am mistaken (I am not an expert on the standard, so the latter is
> > definitely possible).
> >
> > The relevant text is:
> >
> >   An object shall have its stored value accessed only by an lvalue
> >   expression that has one of the following types:
> >
> >   * a type compatible with the effective type of the object,
> >   [...]
> >   * an aggregate or union type that includes one of the aforementioned
> >     types among its members (including, recursively, a member of a
> >     subaggregate or contained union), or
> >
> > To me this allows the following:
> >
> >   int i;
> >   union u { int x; } *pu = (union u*)&i;
> >   printf("%d\n", pu->x);
> >
> > In this example, the object "i", which is of type "int", is having its
> > stored value accessed by an lvalue expression of type "union u", which
> > includes the type "int" among its members.
>
> Even with your interpretation of the C99 standard that example would be
> allowed only if '*pu' is a valid lvalue of type 'union u'.  (Since pu->x
> is equivalent to (*pu).x)
>
> First of all the conversion (union u*)&i is valid only if the alignment
> of 'i' is suitable for an object of type 'union u'.  Lets assume that is the
> case. (Otherwise just making that conversion would result in undefined
> behaviour.)  (See 6.3.2.3 clause 7.)

This is true.  You could get around this particular point by saying:

  int *i = malloc(sizeof(*i));
  *i = 5;
  union u { int x; } *pu = (union u*)i;
  printf("%d\n", pu->x);

...since the return from malloc() is guaranteed to be suitably aligned
for any object (7.20.3).  But your point is taken.

> There is however no guarantee that the conversion yields a valid "pointer to
> union u".  If not then dereferencing it (with the expression '*pu') has
> undefined behaviour. (See 6.5.3.2 clause 4)

I think this is a bit of a stretch.  It is true that 6.5.3.2 says that
dereferencing invalid values has undefined behavior.  But if you are
saying that the standard has to explicitly say that a pointer conversion
will not result in an invalid value (even when suitably aligned), then
the following is also undefined:

  int i;
  unsigned int *pui = (unsigned int*)&i;
  unsigned int ui = *pui;

Andrew cited 6.3.2.3p2 as support for why this is defined, but that
paragraph deals with qualifiers (const, volatile, and restrict).
"unsigned" is not a qualifier.  There is no part of the standard that
guarantees that a pointer conversion from "int*" to "unsigned int*" will
not result in an invalid value.

> So your example contains undefined behaviour even without considering the
> parts of 6.5 clause 7 that you quoted.
>
> Moreover I think you are misinterpreting 6.5 clause 7 (which I concede is
> fairly easy since it is not quite as unambiguous as one could wish).
> I believe that paragraph should not be interpreted as automatically allowing
> all accesses that correspond to one of the sorts listed.  Rather it should
> be interpreted as saying that if an access is not included in that list then
> it is not allowed, but even if it is included in that list there could be
> other reasons why it is not allowed.  (I.e. just as the attached footnote
> suggests it is a list of what types of aliasing are allowed, not of which
> pointers may be dereferenced.)

Interesting.
I think it is plausible that this is what the committee intended.  It
seems like the committee wanted to give a heads-up to implementations
that pointers to primitive types can alias pointers to those same types
within unions and aggregates, just as a result of taking the address of
a member.

In any case I definitely agree that this could be clarified, and I hope
the standards committee will take this up for C1x.

Thank you for your thoughtful analysis.

Josh
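
For illustration only, here is a minimal C sketch of the two situations discussed above. The helper names store_through_member_pointer and read_int_via_copy are hypothetical and do not come from the thread. The first function shows an "int*" aliasing a union simply because a member's address was taken, which is the case the aggregate-or-union bullet of 6.5 clause 7 plausibly exists to cover. The second reads an int's value into a "union u" with memcpy(), which sidesteps the (union u*)&i cast whose validity is debated above.

```c
#include <assert.h>
#include <string.h>

union u { int x; };

/* A pointer to a member aliases the enclosing union: the store made
   through `p` must be visible when the union is read afterwards. */
int store_through_member_pointer(union u *pu, int value)
{
    assert(pu != NULL);
    int *p = &pu->x;   /* int* obtained by taking the member's address */
    *p = value;        /* access through an lvalue of type int */
    return pu->x;      /* reads back the value just stored */
}

/* Copies the bytes of an int into a union u object instead of
   converting the int's address to a union u pointer. */
int read_int_via_copy(const int *pi)
{
    assert(pi != NULL);
    union u tmp;
    memcpy(&tmp.x, pi, sizeof *pi);
    return tmp.x;
}
```

Calling store_through_member_pointer(&v, 7) on a "union u v" leaves v.x equal to 7, and read_int_via_copy(&i) returns the value of i. Neither function depends on how the pointer-conversion question is resolved.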