(I am way, way behind on reading this mailing list.)

Assuming I understand your scheme properly, it seems incorrect to me.

An object must be finalized during the first 'sweep 0' after it is no longer reachable from the root set. (And no object may be finalized while it is still reachable, but your scheme preserves that most fundamental property of GC.)

So let's say I have objects A, B, and C, where A references B references C. C requires timely finalization. After one DOD run, we have B and C in the high priority set. Now A becomes unreachable and we do a 'sweep 0'. We scan the high priority set and reach C, so we finish. But there are no references to B, and hence no references to C, and yet we did not finalize C. Oops.

Am I misunderstanding something?

Still, the basic approach seems promising. The key insight is that we can terminate a 'sweep 0' as soon as all impatient objects have been found, and we can use the knowledge gained from one DOD run to decrease the expected time required to find all those impatient objects in the next run.

Perhaps we keep (arbitrary) back pointers as we do DOD, and when we reach an object requiring timely finalization, we store the entire chain from the root set somewhere, as well as setting your high priority flag on every object in the chain? Then, 'sweep 0' would do a full DOD, except that whenever we reach a high priority object, we collect all of the children together and check whether the next link in the stored chain from that object is present. If so, it is visited first. Otherwise, it's a normal full DOD, except as in your scheme we terminate early once all impatient objects have been marked live.

Any chain roots in the root set should be visited before the rest of the root set, of course. I am visualizing a DOD that works by always collecting all of the children at one level before recursing into any of them, and possibly choosing one or more of those children to visit before the rest.

Whoops. Upon further reflection, I realize that this has a rather large flaw -- you need to somehow jump over to a whole different search space whenever you reach an impatient object (other than the last), because otherwise the depth-first ordering will make you visit just about everything else in the graph anyway.

Umm... explicitly try to walk each of the saved chains in turn? I'm worried that the root nodes in the chains will quickly move a node or two away from the root set between 'sweep 0' calls, but perhaps that fear is unreasonable.

Try to save a "continuation" whenever you start chasing an impatient rabbit down its hole, and jump to that saved search spot when you find the impatient object? (But remember, you'd still have to come back to the original search area if you don't catch enough rabbits.) That would be a bit tricky, since many chains will share a common prefix, so you have to figure out which continuation to choose that won't skip other rabbit trails.

Wait... are we still constructing an explicit linked list of objects to visit? Then perhaps instead of appending discovered children to the end of the list (or the front -- are we doing BFS or DFS?), we use the "high priority" flag -- objects with the flag set get inserted at the front of the list; everything else gets inserted at the end. This naturally results in all the stored chains being visited first (assuming they're still valid), but still visiting every object only once. We don't even need to explicitly store the chains anywhere, we just have to traverse the back pointers from impatient objects and set the high priority flag on everything encountered on the path back to the root set.

Ok, that seems simple enough that either there's some huge problem with it, or it's a standard technique that everyone besides me already knows. Which is it? :-)


Reply via email to