On Tuesday 13 November 2001 01:20 pm, Jason Gloudon wrote:
> On Mon, Nov 12, 2001 at 11:59:08PM -0500, Michael L Maraist wrote:
> > 2)
> > Can we assume that a "buffer object" is ONLY accessible by a single
> > reverse-path-to-PMC?  PMC's or array-buffers can point to other PMC's, so
> > it's possible to have multiple paths from the root to an object, but I'm
> > asking if, for example, we could use an algorithm that relied upon the
> > fact that it only has one direct parent.
>
> That assumption would mean one could not directly share non-constant
> buffers between strings. There has been talk of having copy-on-write
> strings.

That's fine.  I didn't really have strings in mind, and it's possible to 
treat them separately.  We're already having to treat handles differently 
than buffer-objects.  I also want to segregate lists of leaf-buffers, 
arrays, hashes, etc, so as to avoid putting switch-statements in the inner 
loop of the marker.  Since each type would have a different inner loop, 
strings aren't confined to the same algorithms.  So, any other ideas related 
specifically to PMC's?
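
To make the segregation idea concrete, here is a minimal sketch in C.  It is
only one way to read the proposal, not anything taken from the Parrot source:
Obj, WorkList, KIND_LEAF, KIND_ARRAY and WORK_MAX are all made-up names, and
the fixed-size work lists exist only to keep the example short.  Each kind
gets its own work list and its own drain loop, so the loop that walks arrays
never has to switch on the object type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object kinds; a real marker would have more (hashes, ...). */
enum { KIND_LEAF, KIND_ARRAY, KIND_COUNT };

typedef struct Obj {
    int kind;
    int marked;
    struct Obj **elems;  /* used by KIND_ARRAY only */
    size_t n_elems;
} Obj;

#define WORK_MAX 64      /* assumption: fixed capacity keeps the sketch short */

typedef struct {
    Obj *items[WORK_MAX];
    size_t len;
} WorkList;

/* Mark an object once and queue it on the list for its own kind. */
static void mark_and_queue(WorkList lists[KIND_COUNT], Obj *o) {
    if (o == NULL || o->marked) return;
    o->marked = 1;
    WorkList *wl = &lists[o->kind];  /* table lookup instead of a switch */
    assert(wl->len < WORK_MAX);
    wl->items[wl->len++] = o;
}

/* Inner loop for leaves: nothing to trace, so just empty the list. */
static void drain_leaves(WorkList lists[KIND_COUNT]) {
    lists[KIND_LEAF].len = 0;
}

/* Inner loop for arrays: every item on this list is known to be an array. */
static void drain_arrays(WorkList lists[KIND_COUNT]) {
    WorkList *wl = &lists[KIND_ARRAY];
    while (wl->len > 0) {
        Obj *a = wl->items[--wl->len];
        for (size_t i = 0; i < a->n_elems; i++)
            mark_and_queue(lists, a->elems[i]);
    }
}

/* Mark everything reachable from one root. */
void mark_from_root(Obj *root) {
    WorkList lists[KIND_COUNT] = {0};
    mark_and_queue(lists, root);
    drain_arrays(lists);  /* only arrays can produce new work */
    drain_leaves(lists);
}
```

The type test does not vanish entirely; it moves to the point where an
object is queued and becomes an index into a table of lists.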

One interesting thing to note is that strings are leaf-objects.  Further, 
copy-on-write would only be efficient if set for values that have more than 
one parent (otherwise each modification would require a copy).  I haven't 
seen the discussion, so I don't know where it'll wind up, but if this is the 
case, then it seems to me that the only way to determine if copy-on-write 
should be applied (in an efficient manner) is via some variation of reference 
counting, even if only 1-bit ref-counting is used.  For example:

// In pseudo-code
STR_REG[x] = newString("abc"); // f_copy_on_write = 0
STR_REG[x] _= newString("foo"); // check f_c_o_w; it was 0, so we modify
                                // the string in place (possibly resizing)
STR_REG[x] = PL_null_str; // "abc" eventually reclaimed by GC
STR_REG[x] = newString("abc"); // f_c_o_w = 0
STR_REG[y] = STR_REG[x]; // f_c_o_w = 1
STR_REG[x] _= newString("bar"); // check f_c_o_w; was 1, so we copy out
// At this point, a GC pass would count the handles to "abc" and notice
// that it has only one.  Its f_c_o_w would be reset to 0.

Even if the GC didn't perform this last stage, the system would still work; 
it would just produce a greater percentage of multi-ref'd garbage and more 
copying.  Additionally, note that such a pass would imply a pre-GC stage 
which resets every f_c_o_w to zero.
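
The same scheme, spelled out as a rough C sketch.  Everything here is
hypothetical (Str, str_new, str_share, str_append, gc_refresh_cow and the
"marked" bit are names invented for illustration), and the "GC pass" only
walks an array of registers rather than a real heap.  It is meant to show
the flag being set on sharing, checked on modification, and recomputed by
a collector pass, nothing more.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string buffer; f_cow is the 1-bit "may be shared" flag. */
typedef struct Str {
    char *data;
    size_t len;
    unsigned f_cow : 1;
    unsigned marked : 1;  /* scratch bit used by the GC pass below */
} Str;

Str *str_new(const char *s) {
    Str *p = malloc(sizeof *p);
    assert(p != NULL);
    p->len = strlen(s);
    p->data = malloc(p->len + 1);
    assert(p->data != NULL);
    memcpy(p->data, s, p->len + 1);
    p->f_cow = 0;
    p->marked = 0;
    return p;
}

/* Register-to-register assignment: share the buffer and set the flag. */
Str *str_share(Str *s) {
    s->f_cow = 1;  /* plain store; there is no counter to increment */
    return s;
}

/* Append src to the string held in *reg, copying out first if shared. */
void str_append(Str **reg, const Str *src) {
    Str *dst = *reg;
    if (dst->f_cow)
        dst = *reg = str_new(dst->data);  /* old buffer is left to the GC */
    assert(src != dst);  /* sketch does not handle appending to itself */
    char *grown = realloc(dst->data, dst->len + src->len + 1);
    assert(grown != NULL);
    memcpy(grown + dst->len, src->data, src->len + 1);
    dst->data = grown;
    dst->len += src->len;
}

/* Stand-in for the GC pass: reset the flags, then set f_cow again only
 * on strings that are reached through more than one handle. */
void gc_refresh_cow(Str **regs, size_t n_regs) {
    for (size_t i = 0; i < n_regs; i++)
        if (regs[i]) { regs[i]->marked = 0; regs[i]->f_cow = 0; }
    for (size_t i = 0; i < n_regs; i++) {
        Str *s = regs[i];
        if (s == NULL) continue;
        if (s->marked) s->f_cow = 1;  /* second handle to the same string */
        else s->marked = 1;
    }
}
```

After a copy-out, the string left behind keeps f_cow set even though it has
only one handle again; it stays that way until gc_refresh_cow() clears it,
which is the extra garbage and copying mentioned above.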

Setting f_c_o_w to 1 all the time is slightly faster than performing an 
increment.  Further, separate string vtables could be used to avoid the 
if-statements (a shared string would be a RO string instead of a RW string).
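
A sketch of the vtable variant, again in C and again with invented names
(VStr, StrVtable, RW_VTABLE, RO_VTABLE and the vstr_* functions are not
from any real API).  Sharing a string swaps its vtable to the read-only
one, whose append copies out first, so the flag test becomes an indirect
call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct VStr VStr;

/* Hypothetical one-entry vtable. */
typedef struct StrVtable {
    void (*append)(VStr **reg, const char *tail);
} StrVtable;

struct VStr {
    const StrVtable *vt;
    char *data;
    size_t len;
};

static void rw_append(VStr **reg, const char *tail);
static void ro_append(VStr **reg, const char *tail);

static const StrVtable RW_VTABLE = { rw_append };
static const StrVtable RO_VTABLE = { ro_append };

/* New strings start out unshared, so they get the read-write vtable. */
VStr *vstr_new(const char *s) {
    VStr *p = malloc(sizeof *p);
    assert(p != NULL);
    p->len = strlen(s);
    p->data = malloc(p->len + 1);
    assert(p->data != NULL);
    memcpy(p->data, s, p->len + 1);
    p->vt = &RW_VTABLE;
    return p;
}

/* RW string: modify in place.  tail must not point into s->data. */
static void rw_append(VStr **reg, const char *tail) {
    VStr *s = *reg;
    size_t n = strlen(tail);
    char *grown = realloc(s->data, s->len + n + 1);
    assert(grown != NULL);
    memcpy(grown + s->len, tail, n + 1);
    s->data = grown;
    s->len += n;
}

/* RO string: copy out into a fresh RW string, then modify the copy. */
static void ro_append(VStr **reg, const char *tail) {
    *reg = vstr_new((*reg)->data);
    rw_append(reg, tail);
}

/* Sharing replaces "f_c_o_w = 1" with a vtable swap. */
VStr *vstr_share(VStr *s) {
    s->vt = &RO_VTABLE;
    return s;
}

/* Callers dispatch through the vtable and never test a flag. */
void vstr_append(VStr **reg, const char *tail) {
    (*reg)->vt->append(reg, tail);
}
```

Whether the indirect call actually beats a well-predicted branch would have
to be measured; the sketch only shows that the if-statement can be removed.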

Lastly, a one-bit flag might as well occupy a whole byte, and the garbage 
reduction from actually performing full reference counting (albeit with a GC 
fallback) might make this worthwhile.  Full ref-counting on strings would be 
harder to work with for XS-code, though.

-Michael
