On Sun, Jan 3, 2010 at 6:46 AM, Joshua Haberman <jhaber...@gmail.com> wrote:
> The aliasing policies that GCC implements seem to be more strict than
> what is in the C99 standard.  I am wondering if this is true or whether
> I am mistaken (I am not an expert on the standard, so the latter is
> definitely possible).
>
> The relevant text is:
>
>  An object shall have its stored value accessed only by an lvalue
>  expression that has one of the following types:
>
>  * a type compatible with the effective type of the object,
>  [...]
>  * an aggregate or union type that includes one of the aforementioned
>    types among its members (including, recursively, a member of a
>    subaggregate or contained union), or

Reading this sentence literally, as you do, removes nearly all of the
benefit of type-based aliasing when disambiguating a pointer dereference
from a direct object reference.  That cannot be the intended
interpretation, so we do not allow it.

It basically would force us to treat *ptr vs. Obj as *ptr vs. *(Obj *)ptr2.
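
To make the pointer-dereference vs. object-reference case concrete, here
is a minimal sketch.  The names struct Wrapper, counter and
store_then_read are hypothetical and appear nowhere in GCC or the
standard.  It shows the assumption as described in this thread: a store
through ptr->x is taken not to modify the declared int object, so the
second read of counter may reuse the first.

```c
#include <assert.h>
#include <stddef.h>

struct Wrapper { int x; };   /* hypothetical aggregate with an int member */

int counter;                 /* a declared object whose type is plain int */

/* Store through a struct pointer between two reads of a named object.
   As described in this thread, GCC assumes the store to ptr->x cannot
   change `counter`, since a declared int is not a struct Wrapper. */
int store_then_read(struct Wrapper *ptr)
{
    assert(ptr != NULL);
    int before = counter;    /* first read of the named object */
    ptr->x = 5;              /* store through the pointer */
    return before + counter; /* second read, may reuse `before` */
}
```

Called with a pointer to a real struct Wrapper object, the result is the
same with or without the optimization.  The two only diverge for a call
such as store_then_read((struct Wrapper *)&counter), which is exactly the
kind of cast under discussion.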

> To me this allows the following:
>
>  int i;
>  union u { int x; } *pu = (union u*)&i;
>  printf("%d\n", pu->x);
>
> In this example, the object "i", which is of type "int", is having its
> stored value accessed by an lvalue expression of type "union u", which
> includes the type "int" among its members.
>
> I have seen other articles that interpret the standard in this way.
> See section "Casting through a union (2)" from this article, which
> claims that casts of this sort are legal and that GCC's warnings
> against them are false positives:
>  http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html

Yes, this article contains many mistakes but the author failed to listen.

> However, this appears to be contrary to GCC's documentation.  From the
> manpage:
>
>  Similarly, access by taking the address, casting the resulting
>  pointer and dereferencing the result has undefined behavior, even
>  if the cast uses a union type, e.g.:
>
>          int f() {
>            double d = 3.0;
>            return ((union a_union *) &d)->i;
>          }
>
> I have also been able to experimentally verify that GCC will mis-compile
> this fragment if we expect the behavior the standard specifies:
>  int g;
>  struct A { int x; };
>  int foo(struct A *a) {
>    if(g) a->x = 5;
>    return g;
>  }
>
> With GCC 4.3.3 -O3 on x86-64 (Ubuntu), g is only loaded once:
>
> 0000000000000000 <foo>:
>   0:   8b 05 00 00 00 00       mov    eax,DWORD PTR [rip+0x0]        # 6 <foo+0x6>
>   6:   85 c0                   test   eax,eax
>   8:   74 06                   je     10 <foo+0x10>
>   a:   c7 07 05 00 00 00       mov    DWORD PTR [rdi],0x5
>  10:   f3 c3                   repz ret
>
> But this is incorrect if foo() was called as:
>
>  foo((struct A*)&g);
>
> Here is another example:
>
>  struct A { int x; };
>  struct B { int x; };
>  int foo(struct A *a, struct B *b) {
>    if(a->x) b->x = 5;
>    return a->x;
>  }
>
> When I compile this, a->x is only loaded once, even though foo()
> could have been called like this:
>
>  int i;
>  foo((struct A*)&i, (struct B*)&i);
>
> From this I conclude that GCC diverges from the standard, in that it does not
> allow casts of this sort.  In one sense this is good (because the policy GCC
> implements is more aggressive, and yet still reasonable) but on the other hand
> it means (if I am not mistaken) that GCC will incorrectly optimize strictly
> conforming programs.

Correct.  GCC follows its own documentation here, not random websites,
and perhaps not a strict reading of the standard.  There are other
corner cases where it does so, namely the union type rule (for which I
cannot come up with a standard reference at the moment).
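
For contrast with the manpage example quoted above, here is a sketch of
the form GCC's documentation does describe as permitted under
-fstrict-aliasing: writing one union member and reading another, with
every access going through the union object itself rather than through a
cast pointer.  The names union float_bits and bits_of are hypothetical,
and the sketch assumes float is a 32-bit IEEE 754 value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union used only for this illustration. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Return the object representation of a float as a 32-bit integer.
   Both accesses go through the union object `t`; no pointer to a
   differently typed object is cast to a union pointer. */
uint32_t bits_of(float value)
{
    /* Assumption: float and uint32_t have the same size. */
    assert(sizeof(float) == sizeof(uint32_t));
    union float_bits t;
    t.f = value;   /* write one member */
    return t.u;    /* read the other through the same union object */
}
```

The difference from the quoted f() is that the object here is declared
with the union type, whereas f() casts the address of a plain double to a
union pointer.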

Richard.

> Clarifications are most welcome!
>
> Josh
>
>
