On Sun, Jan 3, 2010 at 6:46 AM, Joshua Haberman <jhaber...@gmail.com> wrote:
> The aliasing policies that GCC implements seem to be more strict than
> what is in the C99 standard.  I am wondering if this is true or whether
> I am mistaken (I am not an expert on the standard, so the latter is
> definitely possible).
>
> The relevant text is:
>
>   An object shall have its stored value accessed only by an lvalue
>   expression that has one of the following types:
>
>   * a type compatible with the effective type of the object,
>   [...]
>   * an aggregate or union type that includes one of the aforementioned
>     types among its members (including, recursively, a member of a
>     subaggregate or contained union), or

Literally interpreting this sentence the way you do removes nearly all
advantages of type-based aliasing that you have when dealing with
disambiguating a pointer dereference vs. an object reference
and thus cannot be the desired interpretation (and thus we do not allow this).

It basically would force us to treat *ptr vs. Obj as *ptr vs. *(Obj *)ptr2.

> To me this allows the following:
>
>   int i;
>   union u { int x; } *pu = (union u*)&i;
>   printf("%d\n", pu->x);
>
> In this example, the object "i", which is of type "int", is having its
> stored value accessed by an lvalue expression of type "union u", which
> includes the type "int" among its members.
>
> I have seen other articles that interpret the standard in this way.
> See section "Casting through a union (2)" from this article, which
> claims that casts of this sort are legal and that GCC's warnings
> against them are false positives:
>   http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html

Yes, this article contains many mistakes but the author failed to listen.

> However, this appears to be contrary to GCC's documentation.
> From the manpage:
>
>   Similarly, access by taking the address, casting the resulting
>   pointer and dereferencing the result has undefined behavior, even
>   if the cast uses a union type, e.g.:
>
>     int f() {
>       double d = 3.0;
>       return ((union a_union *) &d)->i;
>     }
>
> I have also been able to experimentally verify that GCC will mis-compile
> this fragment if we expect the behavior the standard specifies:
>
>   int g;
>   struct A { int x; };
>   int foo(struct A *a) {
>     if(g) a->x = 5;
>     return g;
>   }
>
> With GCC 4.3.3 -O3 on x86-64 (Ubuntu), g is only loaded once:
>
>   0000000000000000 <foo>:
>      0:  8b 05 00 00 00 00    mov    eax,DWORD PTR [rip+0x0]   # 6 <foo+0x6>
>      6:  85 c0                test   eax,eax
>      8:  74 06                je     10 <foo+0x10>
>      a:  c7 07 05 00 00 00    mov    DWORD PTR [rdi],0x5
>     10:  f3 c3                repz ret
>
> But this is incorrect if foo() was called as:
>
>   foo((struct A*)&g);
>
> Here is another example:
>
>   struct A { int x; };
>   struct B { int x; };
>   int foo(struct A *a, struct B *b) {
>     if(a->x) b->x = 5;
>     return a->x;
>   }
>
> When I compile this, a->x is only loaded once, even though foo()
> could have been called like this:
>
>   int i;
>   foo((struct A*)&i, (struct B*)&i);
>
> From this I conclude that GCC diverges from the standard, in that it does not
> allow casts of this sort.  In one sense this is good (because the policy GCC
> implements is more aggressive, and yet still reasonable) but on the other hand
> it means (if I am not mistaken) that GCC will incorrectly optimize strictly
> conforming programs.

Correct.  GCC follows its own documentation here, not some random
websites and maybe not the strict reading of the standard.  There are
other corner-cases where it does so, namely with the union type rule
(which I fail to come up with a std reference at the moment).

Richard.

> Clarifications are most welcome!
>
> Josh
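
For contrast with the cast-through-a-pointer fragments quoted above, here is a
minimal sketch of two ways to reinterpret the bytes of an object that do not
rely on casting its address to an unrelated pointer type.  The first accesses
the memory through an object that itself has the union type, which is the form
GCC's documentation describes as supported; the second copies the bytes with
memcpy.  The names double_bits, bits_via_union and bits_via_memcpy are made up
for this illustration, and the sketch assumes that double and uint64_t have
the same size.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint64_t */
#include <string.h>   /* memcpy */

/* Assumption of this sketch: double is 64 bits wide on the target. */
static_assert(sizeof(double) == sizeof(uint64_t),
              "sketch assumes a 64-bit double");

/* Hypothetical union, in the spirit of the manpage's a_union. */
union double_bits {
    double   d;
    uint64_t u;
};

/* The object t has union type and both accesses go through it,
   so no pointer to a double is ever cast to a pointer to a union. */
uint64_t bits_via_union(double d)
{
    union double_bits t;
    t.d = d;
    return t.u;
}

/* Copy the object representation byte by byte.  memcpy is defined
   for any object type, so type-based aliasing is not involved. */
uint64_t bits_via_memcpy(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}
```

Both functions should return the same value for the same argument.  Code that
cannot be restructured this way can also be built with -fno-strict-aliasing,
at the cost of the disambiguation discussed above.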