On Fri, Jun 27, 2008 at 2:31 PM, John Engelhart
<[EMAIL PROTECTED]> wrote:
> Lesson #2:  Since there is so little documentation about the GC system, this
> involves a lot of speculation, but I think it summarizes what's really going
> on.  This all started with an effort to keep a __weak reference to a passed
> in string that was used to initialize an element in a cache.  When the cache
> was checked, if that weak reference was NULL, then the cache line is invalid
> and should be cleared.  The cache consisted of a global array of elements,
> selection was done via KEY_STRING_HASH % CACHE_SIZE, and everything was
> under a mutex lock.  An approximation of the cache is:
>
> typedef struct {
>  NSString *aString;
>  __weak NSString *aWeakString;
>  NSInteger anInteger;
> } MYStructType;
>
> MYStructType globalStructTypeArray[42]; // <-- Global!
>
> Simple, right?  That's how it always starts out...  The first problem
> encountered was:
>
> [EMAIL PROTECTED] /tmp% gcc -o Global_GC Global_GC.m -framework Foundation
> -fobjc-gc
> Global_GC.m:14: warning: __weak attribute cannot be specified on a field
> declaration
>
> (The attached file contains the full example demonstrating the problem.)
>
> I'm not really sure what this means, and I don't recall reading anything in
> the documentation that would suggest anything is amiss.  I never actually
> managed to figure out what, if any, problem this causes because it quickly
> became apparent that there was a much bigger problem that needed dealing
> with:

Speculation: __weak needs a read-barrier as well as a write-barrier,
and with structs people have a long history of reading them without
going through the accessor. This isn't generally a problem for
__strong and write barriers because for all of this to work you need
to make sure that the memory for MYStructType is GC scanned anyway.

> The pointer to 'aString' in the above (or any of my other __strong pointers
> in my actual code) were clearly not being treated as __strong, and the GC
> system was reclaiming them causing all sorts of fun and random crashes.
>
> The documentation states: The initial root set of objects is comprised of
> global variables, stack variables, and objects with external references.
> These objects are never considered as garbage.

This is kind of a lie, since not ALL global memory is scanned as a
root. Hence the need for special assigns.

> Putting the pieces together, it became obvious what was really going on.
>  The two commented out lines in the example that update the global variable
> are the key to the mystery and make everything work as expected.
>
> It turns out that when the documentation says that "root set of objects is
> comprised of global variables", it's true, but probably not in the way that
> you think it is.
>
> It would 'seem' that global variables are only __strong when the compiler
> can reason that you're referring to a global variable directly. In this
> particular case, that would be:
>
> globalStructTypeArray[23].aString = newString;

Speculation: another way to think of it is that not all global memory
is considered a collectable root until you've first used it. That is,
on the first call to objc_assign_global, the pointer is added to the
list of collectable roots. It appears to be a lazy sort of system.

> They are not strong when you refer to them indirectly (even though write
> barriers are clearly being performed), such as:
>
> update(&globalStructTypeArray[23], newString);
>
> update(MYStructType *aStructType, NSString *string) {
>  aStructType->aString = string;
> }
>
> Looking at the assembly output, the reason becomes clear:
>
> The write barrier used by the first, direct reference is objc_assign_global,
> while the write barrier used by the indirect reference in update is
> objc_assign_strongCast.
>
> This is probably an important point that you should consider if you're
> depending on global variables being truly __strong.  No doubt someone here
> will explain that this isn't a bug, it's just that you shouldn't reference a
> global variable via a pointer (this is sarcastic for the challenged).

You shouldn't reference a global variable via a pointer! Kidding.

The problem is essentially the same as the one in this code:

class Foo {
  public:
  NSString* fieldA;
  int fieldB;

  Foo( NSString *_fieldA, int _fieldB ) : fieldA( _fieldA ), fieldB( _fieldB ) {}
};

Foo *f = new Foo( @"Something strong", 42 );

IIRC, you'll also find that here f->fieldA is collected way before you
expect. Only this time, there are plenty of emails about how to fix it.
The problem is that ::new returns a block of non-GC memory. So even
though the write barriers are set up properly, f->fieldA is in a
non-scanned region. See here:

http://lists.apple.com/archives/Cocoa-dev/2008/Feb/msg00435.html

In your case, globalStructTypeArray is also in a non-scanned region,
which is why the compiler uses the special _global assign. But you've
hidden the global nature from the compiler by using the pointer, so it
fails.
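
To make the speculation concrete, here is a toy model in plain C. It
is NOT the real runtime: toy_assign_global, toy_assign_strong_cast,
toy_is_root and toy_is_reachable are made-up names, and the root table
is my guess at the kind of bookkeeping objc_assign_global might do
that objc_assign_strongCast does not. It only shows why a store made
through a pointer could leave the slot invisible to the collector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ROOTS 64

/* Hypothetical root table: addresses of global slots the toy collector scans. */
static void **toy_roots[TOY_MAX_ROOTS];
static size_t toy_root_count = 0;

/* True if this slot has already been registered as a root. */
static bool toy_is_root(void **slot) {
    for (size_t i = 0; i < toy_root_count; i++) {
        if (toy_roots[i] == slot) return true;
    }
    return false;
}

/* Models the barrier for a direct store to a global (assumption):
   lazily register the slot as a root, then store the value. */
static void toy_assign_global(void *value, void **slot) {
    if (!toy_is_root(slot)) {
        assert(toy_root_count < TOY_MAX_ROOTS);
        toy_roots[toy_root_count++] = slot;
    }
    *slot = value;
}

/* Models the barrier for a store through an arbitrary pointer (assumption):
   the value is stored, but the slot is never registered as a root. */
static void toy_assign_strong_cast(void *value, void **slot) {
    *slot = value;
}

/* A value survives only if some registered root currently holds it. */
static bool toy_is_reachable(const void *value) {
    for (size_t i = 0; i < toy_root_count; i++) {
        if (*toy_roots[i] == value) return true;
    }
    return false;
}

typedef struct {
    void *aString;
    int anInteger;
} ToyStruct;

static ToyStruct toy_global_array[42]; /* global, like globalStructTypeArray */

/* Same shape as update() in the quoted code: the global hides behind a pointer. */
static void toy_update(ToyStruct *s, void *string) {
    toy_assign_strong_cast(string, &s->aString);
}
```

In this model, a value stored with toy_update(&toy_global_array[23], p)
is not reachable from any root, while one stored with
toy_assign_global(p, &toy_global_array[7].aString) is.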

> I'll leave you to ponder the implications of the above.  The next nut to
> crack after that one is:  __weak pointers must be read via a wrapper
> function (objc_read_weak), and you can't tell if the pointer passed in is
> actually a __weak reference to, say, a NSString, then do you have to assume
> the worst that every pointer passed in may potentially be __weak and
> therefore for safety must be wrapped in a call to objc_read_weak()?  Talk
> amongst yourselves.
>
> Since I can't arrange for my code to always use the GC variable directly,
> and I don't have an answer wrt/ to the "always assume __weak" question, I've
> pretty much abandoned GC for this particular use.

Maybe I misunderstand. If you have this code:

void bar( NSString *bResult );  // declared before its use in foo()

void foo() {
  __weak NSString *aResult = nil;

  aResult = getNextResult();
  bar( aResult );
}

void bar( NSString *bResult ) {
  doSomething( bResult );
}

ISTM that when you call bar(), the pointer passed in now has a strong
reference to it (bResult). bar() shouldn't care if its argument
originally came from a weakly held pointer; the call to
objc_read_weak() is made when forming the argument stack, not later
inside bar(). I don't know this for sure, but it would be insane to
have it work any other way.
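
Spelled out as a toy model in plain C, this is what I'm assuming
happens. Again, none of it is the real runtime: ToyWeakSlot,
toy_read_weak, toy_clear_weak, toy_foo and toy_bar are hypothetical
names, and toy_read_weak merely stands in for objc_read_weak(). The
point is only that the weak slot is read once, by the caller, and the
callee gets an ordinary pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical weak slot: the toy collector zeroes it when the referent dies. */
typedef struct {
    void *referent;
} ToyWeakSlot;

/* Stand-in for a weak read barrier: returns the referent, or NULL if cleared. */
static void *toy_read_weak(const ToyWeakSlot *slot) {
    return slot->referent;
}

/* Stand-in for the collector clearing a weak slot. */
static void toy_clear_weak(ToyWeakSlot *slot) {
    slot->referent = NULL;
}

/* Records what the callee saw, so the behaviour can be inspected. */
static const void *toy_last_seen = NULL;

/* The callee takes a plain pointer and knows nothing about weakness. */
static void toy_bar(const void *bResult) {
    toy_last_seen = bResult;
}

/* The caller performs the weak read while forming the argument. */
static void toy_foo(const ToyWeakSlot *weak) {
    assert(weak != NULL);
    void *strong = toy_read_weak(weak); /* read barrier happens here, once */
    toy_bar(strong);
}
```

If that assumption holds, bar() never needs to wrap its argument in
objc_read_weak(); only code that reads the __weak variable itself
does.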

GC is kind of cool, but I think in C it's more hassle than it's worth.
All these problems arise because some of the memory is GC and some
isn't. In Java and C# (IIRC) it's all GC. Does any language handle
this half-GC, half-not arrangement well?