On Fri, Jun 27, 2008 at 2:31 PM, John Engelhart <[EMAIL PROTECTED]> wrote:
> Lesson #2: Since there is so little documentation about the GC system, this
> involves a lot of speculation, but I think it summarizes what's really going
> on. This all started with an effort to keep a __weak reference to a passed
> in string that was used to initialize an element in a cache. When the cache
> was checked, if that weak reference was NULL, then the cache line is invalid
> and should be cleared. The cache consisted of a global array of elements,
> selection was done via KEY_STRING_HASH % CACHE_SIZE, and everything was
> under a mutex lock. An approximation of the cache is:
>
>     typedef struct {
>       NSString *aString;
>       __weak NSString *aWeakString;
>       NSInteger anInteger;
>     } MYStructType;
>
>     MYStructType globalStructTypeArray[42]; // <-- Global!
>
> Simple, right? That's how it always starts out... The first problem
> encountered was:
>
>     [EMAIL PROTECTED] /tmp% gcc -o Global_GC Global_GC.m -framework Foundation -fobjc-gc
>     Global_GC.m:14: warning: __weak attribute cannot be specified on a field declaration
>
> (The attached file contains the full example demonstrating the problem.)
>
> I'm not really sure what this means, and I don't recall reading anything in
> the documentation that would suggest anything is amiss. I never actually
> managed to figure out what, if any, problem this causes because it quickly
> became apparent that there was a much bigger problem that needed dealing
> with:

Speculation: __weak needs a read barrier as well as a write barrier, and with
structs people have a long history of reading them without going through the
accessor. This isn't generally a problem for __strong and write barriers
because for all of this to work you need to make sure that the memory for
MYStructType is GC scanned anyway.

> The pointer to 'aString' in the above (or any of my other __strong pointers
> in my actual code) were clearly not being treated as __strong, and the GC
> system was reclaiming them causing all sorts of fun and random crashes.
>
> The documentation states: The initial root set of objects is comprised of
> global variables, stack variables, and objects with external references.
> These objects are never considered as garbage.

This is kind of a lie since not ALL global memory is treated as collectable.
Hence the need for special assigns.

> Putting the pieces together, it became obvious what was really going on.
> The two commented out lines in the example that update the global variable
> are the key to the mystery and make everything work as expected.
>
> It turns out that when the documentation says that "root set of objects is
> comprised of global variables", it's true, but probably not in the way that
> you think it is.
>
> It would 'seem' that global variables are only __strong when the compiler
> can reason that you're referring to a global variable directly. In this
> particular case, that would be:
>
>     globalStructTypeArray[23].aString = newString;

Speculation: another way to think of it is that not all global memory is
considered a collectable root until you've first used it. That is, on the
first call to objc_assign_global, the pointer is added to the list of
collectable roots. It appears to be a lazy sort of system.
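
To make that speculation concrete, here is a toy model in plain C. It is NOT
the real runtime: every toy_* name is made up for illustration, and the only
thing it demonstrates is the idea that a "global" barrier registers the slot
as a root the first time it is used, while a "strong cast" barrier just
stores and trusts that the destination is already scanned.

```c
#include <stddef.h>

/* Toy model only. None of these toy_* functions exist in the Objective-C
 * runtime; they are hypothetical stand-ins for the speculation above. */
#define TOY_MAX_ROOTS 16

static void **toy_root_slots[TOY_MAX_ROOTS]; /* slots the toy collector scans */
static size_t toy_root_count = 0;

/* Returns 1 if this slot has already been registered as a root. */
static int toy_is_root(void **slot) {
    for (size_t i = 0; i < toy_root_count; i++)
        if (toy_root_slots[i] == slot) return 1;
    return 0;
}

/* Stand-in for the global write barrier: lazily register the slot, then store. */
static void toy_assign_global(void *value, void **slot) {
    if (!toy_is_root(slot) && toy_root_count < TOY_MAX_ROOTS)
        toy_root_slots[toy_root_count++] = slot;
    *slot = value;
}

/* Stand-in for the strong-cast write barrier: store only, no registration. */
static void toy_assign_strong_cast(void *value, void **slot) {
    *slot = value;
}

/* In this toy, an object is reachable only if a registered root slot holds it. */
static int toy_is_reachable(void *object) {
    for (size_t i = 0; i < toy_root_count; i++)
        if (*toy_root_slots[i] == object) return 1;
    return 0;
}

typedef struct { void *aString; long anInteger; } ToyStruct;
static ToyStruct toyGlobalArray[42];

/* Indirect update through a pointer: only the strong-cast stand-in runs. */
static void toy_update(ToyStruct *s, void *value) {
    toy_assign_strong_cast(value, &s->aString);
}
```

In this model, an object stored with toy_update(&toyGlobalArray[23], obj) is
not reachable, because nothing ever registered that slot. Store it with
toy_assign_global(obj, &toyGlobalArray[23].aString) and it is. Whether the
real collector behaves this way is exactly the part I'm guessing at.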

> They are not strong when you refer to them indirectly (even though write
> barriers are clearly being performed), such as:
>
>     update(&globalStructTypeArray[23], newString);
>
>     update(MYStructType *aStructType, NSString *string) {
>       aStructType->aString = string;
>     }
>
> Looking at the assembly output, the reason becomes clear:
>
> The write barrier used by the first, direct reference is objc_assign_global,
> while the write barrier used by the indirect reference in update is
> objc_assign_strongCast.
>
> This is probably an important point that you should consider if you're
> depending on global variables being truly __strong. No doubt someone here
> will explain that this isn't a bug, it's just that you shouldn't reference a
> global variable via a pointer (this is sarcastic for the challenged).

You shouldn't reference a global variable via a pointer!

Kidding. The problem is essentially the same as the one in this code:

    class Foo {
    public:
      NSString* fieldA;
      int fieldB;
      Foo( NSString *_fieldA, int _fieldB ) : fieldA( _fieldA ), fieldB( _fieldB ) {}
    };

    Foo *f = new Foo( @"Something strong", 42 );

IIRC, you'll also find that here f->fieldA is collected way before you
expect. Only this time, there are plenty of emails about how to fix it. The
problem is that ::new returns a block of non-GC memory. So even though the
write barriers are set up properly, f->fieldA is in a non-scanned region.
See here:

http://lists.apple.com/archives/Cocoa-dev/2008/Feb/msg00435.html

In your case, globalStructTypeArray is also in a non-scanned region, which is
why the compiler uses the special _global assign. But you've hidden the
global nature from the compiler by using the pointer, so it fails.

> I'll leave you to ponder the implications of the above.
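
Going back to the update() example for a second: one possible workaround is
to pass a slot index instead of a MYStructType *, so that the store itself
names the global array. The sketch below is plain C with a const char *
standing in for the NSString * field (and the __weak field left out), and
updateSlot is a name I made up. In plain C this is just an ordinary store. My
assumption, based only on your observation that direct references get
objc_assign_global, is that a GC build would emit the global barrier here; I
have not checked the assembly.

```c
#include <stddef.h>

#define CACHE_SIZE 42

/* Simplified stand-in for the struct in the original post. */
typedef struct {
    const char *aString;   /* stands in for the __strong NSString * field */
    long anInteger;
} MYStructType;

static MYStructType globalStructTypeArray[CACHE_SIZE];

/* Hypothetical helper: takes an index rather than a pointer, so the
 * assignment below refers to the global array directly.
 * Returns 1 on success, 0 if the index is out of range. */
static int updateSlot(size_t index, const char *string) {
    if (index >= CACHE_SIZE) return 0;
    globalStructTypeArray[index].aString = string;
    return 1;
}
```

So instead of update(&globalStructTypeArray[23], newString) you would call
updateSlot(23, newString). It doesn't help if the code genuinely can't know
that it is dealing with the global array.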
> The next nut to crack after that one is: __weak pointers must be read via
> a wrapper function (objc_read_weak), and you can't tell if the pointer
> passed in is actually a __weak reference to, say, a NSString, then do you
> have to assume the worst that every pointer passed in may potentially be
> __weak and therefore for safety must be wrapped in a call to
> objc_read_weak()? Talk amongst yourselves.
>
> Since I can't arrange for my code to always use the GC variable directly,
> and I don't have an answer wrt/ to the "always assume __weak" question, I've
> pretty much abandoned GC for this particular use.

Maybe I misunderstand. If you have this code:

    void foo() {
      __weak NSString *aResult = nil;
      aResult = getNextResult();
      bar( aResult );
    }

    void bar( NSString *bResult ) {
      doSomething( bResult );
    }

ISTM that when you call bar(), the pointer passed in now has a strong
reference to it (bResult). bar() shouldn't care if its argument originally
came from a weakly held pointer; the call to objc_read_weak() is made when
forming the argument stack, not later inside bar(). I don't know this for
sure, but it would be insane to have it work any other way.

GC is kind of cool, but I think in C it's more hassle than it's worth. All
these problems arise because some of the memory is GC and some isn't. In Java
and C# (IIRC) it's all GC. Does any language do this half is/isn't well?

_______________________________________________
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)