I've long wondered how GCC deals with C99 strict aliasing rules when
compiling Objective-C code.  There's no language spec for Objective-C,
other than the written prose description of the language that Apple
provides (which, until recently, had been virtually unmodified since
its NeXT origins), so there's no definitive source to consult for
answers to these kinds of questions.

I recently had some time to dig in to the compiler to try to find an
answer.  Keep in mind I'm no expert on GCC's internals.

The problem is roughly this:  How do C99's strict aliasing rules
interact with pointers to Objective-C objects?  Complicating matters,
how do those rules apply to the complexities that object-oriented
polymorphism causes?  As an example, given these declarations:

id object;

NSString *string;
NSMutableString *mutableString;

Objective-C (and object-oriented principles) say the following are legal:

object = string;
object = mutableString;
string = object;
string = mutableString;
mutableString = object;

And the following results in 'warning: assignment from distinct
Objective-C type', which is expected:

mutableString = string;

There are really two distinct ways of looking at this problem: what is
permitted under Objective-C and object-oriented principles (where
there doesn't seem to be any problem), and what is permitted under C,
specifically C99 and its strict-aliasing rules.

Without a language spec to guide us, we need to make some reasonable
assumptions at some point.  I've never seen a 'standards grade
language specification' definition of what a 'class' (i.e., NSString
in the above) is.  Is it a genuinely opaque type that exists outside
of C's defined 'types'?  Or is it really just syntactic sugar for a C
struct?  I've always subscribed to the syntactic sugar definition, as
this is how GCC represents things internally, and it is the way that
every Objective-C compiler I'm aware of has done things.
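
To make the 'syntactic sugar' reading concrete, here is a minimal
sketch in plain C of how a small class hierarchy might be modeled as
nested structs.  Every name in it (demo_object, demo_string, and so
on) is hypothetical and invented for illustration; it is not how GCC
or the Objective-C runtime actually lays objects out.  It only shows
the one 'upcast' that C99 itself clearly blesses: converting a pointer
to a struct into a pointer to its first member.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from GCC or the runtime. */
struct demo_object {                /* plays the role of the root class */
    const char *isa;                /* a class name instead of a real Class pointer */
};

struct demo_string {                /* plays the role of NSString */
    struct demo_object base;        /* superclass ivars come first */
    int length;
};

struct demo_mutable_string {        /* plays the role of NSMutableString */
    struct demo_string base;        /* superclass ivars come first */
    int capacity;
};

/* Upcast.  C99 6.7.2.1 says a pointer to a struct, suitably converted,
   points to its initial member, so this conversion is well defined. */
struct demo_string *demo_as_string(struct demo_mutable_string *m)
{
    return &m->base;
}

/* Reads an inherited ivar through the superclass view. */
int demo_string_length(const struct demo_string *s)
{
    return s->length;
}

/* Builds a 'mutable string' and reads it back through the superclass view. */
int demo_upcast_length(void)
{
    struct demo_mutable_string m = { { { "demo_mutable_string" }, 5 }, 16 };

    assert(strcmp(demo_as_string(&m)->base.isa, "demo_mutable_string") == 0);
    return demo_string_length(demo_as_string(&m));
}
```

The reverse direction (treating a demo_string pointer as a
demo_mutable_string pointer) is only safe when the object really was
created as the larger struct, which is exactly the kind of knowledge
the C type system does not carry.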

Working under the assumption that it really is just syntactic sugar,
this would seem to interact rather poorly with C99's 'new' type-based
strict-aliasing rules.  In fact, I have a hard time reconciling
Objective-C's free-wheeling type-punning ways with C99's
strict-aliasing rules, with the possible exception of recasting
Objective-C in terms of C's unions.  When I went looking through the
compiler sources to see how it copes with the problem, I was unable
to find anything that deals with it.  Is that really the case, or
have I missed something?
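
As a rough illustration of what 'recasting in terms of C's unions'
could mean, the sketch below relies on C99 6.5.2.3p5: when a union
contains several structs that share a common initial sequence, that
shared part may be inspected through any of them wherever the
complete union type is visible.  The struct and function names are
hypothetical, and this is an assumption about one possible encoding,
not something found in GCC.

```c
#include <assert.h>

/* Hypothetical ivar layouts; the first two members form a common
   initial sequence shared by both structs. */
struct demo_string_ivars         { int isa_tag; int length; };
struct demo_mutable_string_ivars { int isa_tag; int length; int capacity; };

/* Declaring both layouts as members of one union is what licenses
   reading the shared prefix through either member. */
union demo_any_string {
    struct demo_string_ivars         string;
    struct demo_mutable_string_ivars mutable_string;
};

/* Reads 'length' through the superclass-like view, whichever member
   was actually stored. */
int demo_length(const union demo_any_string *u)
{
    return u->string.length;
}

/* Stores through the subclass-like member, reads through the other. */
int demo_union_roundtrip(void)
{
    union demo_any_string u;

    u.mutable_string.isa_tag  = 2;
    u.mutable_string.length   = 7;
    u.mutable_string.capacity = 32;
    return demo_length(&u);
}
```

The obvious drawback is that every class in a hierarchy would have to
appear in such a union, which does not fit well with classes being
added in separate translation units or at runtime.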

Objective-C defines 'c_common_get_alias_set' as its language-specific
alias set manager.  c_common_get_alias_set() seems to implement only
C's strict aliasing rules, with no provisions for Objective-C's
needs.  To test this, I added the following to
c_common_get_alias_set() to see what happens:

   /* Experiment: report every Objective-C type and give it alias
      set 0, i.e. 'can alias anything'.  */
   if (((c_language == clk_objc) || (c_language == clk_objcxx))
       && ((TYPE_LANG_SPECIFIC (t) && TYPE_LANG_SPECIFIC (t)->objc_info)
           || objc_is_object_ptr (t)))
     {
       warning (OPT_Wstrict_aliasing,
                "Caught and returning 'can alias anything' for objc type");
       return 0;
     }

right before the following line:

   if (c_language != clk_c || flag_isoc99)

Compiling with -O2 -Wstrict-aliasing produces an awful lot of
'Caught..' messages.  Assuming that ivar access is really just
syntactic sugar for self->IVAR, it would seem that there are times
when strict aliasing can cause the compiler to generate "bad code",
for some extremely complicated definition of "the correct thing to
do", given that we lack the benefit of a standards-grade language
specification.
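
The kind of "bad code" in question can be sketched in plain C.  The
two structs below are hypothetical stand-ins for two class types whose
ivar sits at the same offset.  If both pointers referred to the same
memory, an optimizer using type-based alias analysis may assume the
store through one struct type cannot change a field of the other and
return the stale value.  Whether a given GCC version actually does so
for this exact shape is an assumption here, not something verified.

```c
#include <assert.h>

/* Hypothetical stand-ins for two distinct class types. */
struct demo_a { int ivar; };
struct demo_b { int ivar; };

/* With type-based alias analysis, the compiler may treat 'a' and 'b'
   as never overlapping and fold the return value to 1 without
   reloading a->ivar after the store through 'b'. */
int demo_store_then_load(struct demo_a *a, struct demo_b *b)
{
    a->ivar = 1;
    b->ivar = 2;
    return a->ivar;
}

/* Well-defined use: the two pointers refer to distinct objects, so
   the result is 1 whether or not the reload is optimized away. */
int demo_distinct_objects(void)
{
    struct demo_a a = { 0 };
    struct demo_b b = { 0 };
    int result = demo_store_then_load(&a, &b);

    assert(b.ivar == 2);
    return result;
}
```

Passing the same object through both parameters (via a cast) is the
problematic case; it is deliberately left out because its behavior is
not something C99 lets us rely on.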

Can anyone with a much better understanding of GCC's internals comment on this?
