Hi Nicholas

The final design is now waiting on Dan, but it is always interesting to see
other ideas.

Like you, I rejected the parent/child technique. However, my proposed
solution did not use any links at all, because it relies on the garbage
collection system to determine when a shared buffer has only one reference.

It worked as follows (ignoring substrings for simplicity of explanation):

Each string header (the equivalent of a scalar in this regard, as it points
to a specific string instance) points to the shared buffer
The string header contains a COW flag, set in both source and destination
headers when a string is [not] copied
The buffer also contains a flag, used only during garbage collection and
always initialised to zero
When a string needs to be changed, we simply copy the data into a new buffer
and clear the COW flag in the header

During the memory pool compaction phase of the GC system, the first header
encountered that points to a specific buffer will cause the buffer to be
collected as normal; the special flag is set in the old copy of the buffer
to say that this buffer has been relocated, and the address of this first
header is stored in the old buffer. The COW flag is cleared at this time.
If another header is found that points to a buffer that has already been
moved, no copying takes place, and the header is simply corrected to point
to the new location of the buffer. The first header has its COW flag set
again.

The result is that the last header of a COWed string will still believe that
the buffer is shared until a GC collection run occurs, and therefore could
result in buffers being copied unnecessarily. Your system eliminates this
problem; however, I believe that Dan may be averse to using a linked list -
we'll see.

--
Peter Gibbs
EmKel Systems


Reply via email to