On Wed, Jan 06, 2010 at 04:09:11AM +0000, Joshua Haberman wrote:
> Erik Trulsson <ertr1013 <at> student.uu.se> writes:
> > On Sun, Jan 03, 2010 at 05:46:48AM +0000, Joshua Haberman wrote:
> > > The aliasing policies that GCC implements seem to be more strict than
> > > what is in the C99 standard.  I am wondering if this is true or whether
> > > I am mistaken (I am not an expert on the standard, so the latter is
> > > definitely possible).
> > >
> > > The relevant text is:
> > >
> > >   An object shall have its stored value accessed only by an lvalue
> > >   expression that has one of the following types:
> > >
> > >   * a type compatible with the effective type of the object,
> > >   [...]
> > >   * an aggregate or union type that includes one of the aforementioned
> > >     types among its members (including, recursively, a member of a
> > >     subaggregate or contained union), or
> > >
> > > To me this allows the following:
> > >
> > >   int i;
> > >   union u { int x; } *pu = (union u*)&i;
> > >   printf("%d\n", pu->x);
> > >
> > > In this example, the object "i", which is of type "int", is having its
> > > stored value accessed by an lvalue expression of type "union u", which
> > > includes the type "int" among its members.
> >
> > Even with your interpretation of the C99 standard that example would be
> > allowed only if  '*pu' is a valid lvalue of type  'union u'.  (Since pu->x
> > is equivalent to (*pu).x)
> >
> > First of all the conversion  (union u*)&i is valid only if the alignment
> > of 'i' is suitable for an object of type 'union u'.  Let's assume that is the
> > case. (Otherwise just making that conversion would result in undefined
> > behaviour.)  (See 6.3.2.3 clause 7.)
> 
> This is true.  You could get around this particular point by saying:
> 
>   int *i = malloc(sizeof(*i));
>   *i = 5;
>   union u { int x; } *pu = (union u*)i;
>   printf("%d\n", pu->x);
> 
> ...since the return from malloc() is guaranteed to be suitably aligned for
> any object (7.20.3).  But your point is taken.
> 
> > There is however no guarantee that the conversion yields a valid "pointer to
> > union u".  If not then dereferencing it (with the expression '*pu') has
> > undefined behaviour. (See 6.5.3.2 clause 4)
> 
> I think this is a bit of a stretch.  It is true that 6.5.3.2 says that
> dereferencing invalid values has undefined behavior.  But if you are
> saying that the standard has to explicitly say that a pointer conversion
> will not result in an invalid value (even when suitably aligned), then
> the following is also undefined:
> 
>   int i;
>   unsigned int *pui = (unsigned int*)&i;
>   unsigned int ui = *pui;
> 
> Andrew cited 6.3.2.3p2 as support for why this is defined, but that
> paragraph deals with qualifiers (const, volatile, and restrict).
> "unsigned" is not a qualifier.  There is no part of the standard that
> guarantees that a pointer conversion from "int*" to "unsigned int*" will
> not result in an invalid value.

(First I will assume that 'i' has been assigned some value, so that it does
not contain a trap representation; otherwise the assignment to 'ui' would
have undefined behaviour.)

I think 6.2.5 clause 27 is very relevant here.  It says that 'pointer to
int' and 'pointer to union' need not have the same representation as each
other.  It also seems that 'pointer to int' and 'pointer to unsigned int'
need not have the same representation and alignment requirements (at least
I cannot find anything that says that signed and unsigned variants are
compatible types), which I must admit comes as a bit of a surprise to me.

So, yes, that example does technically seem to be undefined (but I don't
know of any real-world implementation where it would not work as expected.)
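
For what it is worth, here is a minimal sketch of one way to sidestep the
pointer-conversion question: copy the bytes of the int into an unsigned int
with memcpy(), so that no 'int *' is ever converted to 'unsigned int *'.
The helper name int_bytes_as_unsigned is hypothetical, made up purely for
illustration.  I am assuming that the copied bytes form a valid (non-trap)
unsigned int value, which 6.2.5 clause 9 appears to guarantee at least for
non-negative values of the int.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the standard or from this thread):
 * reinterpret the bytes of an int as an unsigned int without converting
 * 'int *' to 'unsigned int *'. */
unsigned int int_bytes_as_unsigned(int value)
{
    unsigned int result;

    /* Corresponding signed and unsigned types occupy the same amount of
     * storage (6.2.5 clause 6); this check only documents that. */
    assert(sizeof result == sizeof value);

    /* The only pointer conversions here are to 'void *', which are
     * covered by 6.3.2.3 clause 1. */
    memcpy(&result, &value, sizeof result);
    return result;
}
```

Calling int_bytes_as_unsigned(5) should give 5u, since a non-negative value
has the same representation in both types.  For negative values the result
depends on how the implementation represents signed integers.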

> 
> > So your example contains undefined behaviour even without considering the
> > parts of 6.5 clause 7 that you quoted.
> >
> > Moreover I think you are misinterpreting 6.5 clause 7 (which I concede is
> > fairly easy since it is not quite as unambiguous as one could wish).
> > I believe that paragraph should not be interpreted as automatically allowing
> > all accesses that correspond to one of the sorts listed.  Rather it should
> > be interpreted as saying that if an access is not included in that list then
> > it is not allowed, but even if it is included in that list there could be
> > other reasons why it is not allowed.  (I.e.  just as the attached footnote
> > suggests it is a list of what types of aliasing are allowed, not of which
> > pointers may be dereferenced.)
> 
> Interesting.  I think it is plausible that this is what the committee
> intended.  It seems like the committee wanted to give a heads-up to
> implementations that pointers to primitive types can alias pointers to
> those same types within unions and aggregates, just as a result of
> taking the address of a member.
> 
> In any case I definitely agree that this could be clarified, and I hope the
> standards committee will take this up for C1x.
> 
> Thank you for your thoughtful analysis.
> 
> Josh
> 

-- 
<Insert your favourite quote here.>
Erik Trulsson
ertr1...@student.uu.se
