On Tuesday 13 November 2001 01:20 pm, Jason Gloudon wrote:
> On Mon, Nov 12, 2001 at 11:59:08PM -0500, Michael L Maraist wrote:
> > 2)
> > Can we assume that a "buffer object" is ONLY accessible by a single
> > reverse-path-to-PMC? PMC's or array-buffers can point to other PMC's, so
> > it's possible to have multiple paths from the root to an object, but I'm
> > asking if, for example, we could use an algorithm that relied upon the
> > fact that it only has one direct parent.
>
> That assumption would mean one could not directly share non-constant
> buffers between strings. There has been talk of having copy-on-write
> strings.
That's fine. I didn't really have strings in mind, and it's possible to treat them separately. We're already having to treat handles differently than buffer-objects. I'm also wanting to segregate lists of leaf-buffers, arrays, hashes, etc, so as to avoid putting switch-statements in the inner loop of the marker. Since each type would have a different inner loop, strings are confined to the same algorithms. So, any other ideas related specifically to PMC's?

One interesting thing to note is that strings are leaf-objects. Further, copy-on-write would only be efficient if set for values that have more than one parent (otherwise each modification would require a copy). I haven't seen the discussion, so I don't know where it'll wind up, but if this is the case, then it seems to me that the only way to determine if copy-on-write should be applied (in an efficient manner) is via some variation of reference counting, even if only 1-bit ref-counting is used. Such as:

    // In pseudo-code
    STR_REG[x] = newString("abc");   // f_copy_on_write = 0
    STR_REG[x] _= newString("foo");  // check f_c_o_w; it was 0, so we modify string (possibly resizing)
    STR_REG[x] = PL_null_str;        // "abc" eventually reclaimed by GC

    STR_REG[x] = newString("abc");   // f_c_o_w = 0
    STR_REG[y] = STR_REG[x]          // f_c_o_w = 1
    STR_REG[x] _= newString("bar");  // check f_c_o_w; was 1, so we copy out
    // at this point, a GC-pass would count the instances of "abc" and notice
    // that it only has one handle. Its f_c_o_w would be reset to 0.

Even if the GC didn't perform this last stage, the system would work; it would just cause a greater percentage of multi-ref'd garbage and copying. Additionally, note that such a pass would imply a pre-GC-stage which resets f_c_o_w's to zero.

Setting f_c_o_w to 1 all the time is slightly faster than performing an increment. Further, separate string vtables could be utilized to avoid the if-statements (being a RO string instead of a RW string).
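To make the 1-bit idea a little more concrete, here is a rough C sketch of how I picture it. Everything in it is made up for illustration (StrBuf, new_string, share_string and append_string are hypothetical names, not anything in the current tree), and it assumes the old buffer is simply left for the GC after a copy-out rather than freed on the spot:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string buffer; not Parrot's actual structure. */
typedef struct {
    char         *data;
    size_t        len;
    unsigned char f_copy_on_write; /* sticky bit: 1 = may have >1 handle */
} StrBuf;

/* STR_REG[x] = newString("abc"): a fresh buffer has exactly one handle. */
StrBuf *new_string(const char *s)
{
    StrBuf *b = malloc(sizeof *b);
    assert(b != NULL);
    b->len  = strlen(s);
    b->data = malloc(b->len + 1);
    assert(b->data != NULL);
    memcpy(b->data, s, b->len + 1);
    b->f_copy_on_write = 0;
    return b;
}

/* STR_REG[y] = STR_REG[x]: hand out a second handle.
 * A plain store of 1, no increment needed. */
StrBuf *share_string(StrBuf *b)
{
    b->f_copy_on_write = 1;
    return b;
}

/* STR_REG[x] _= suffix: returns the buffer the register should hold
 * afterwards.  Shared buffers are copied out; unshared ones are
 * modified (and resized) in place. */
StrBuf *append_string(StrBuf *b, const char *suffix)
{
    size_t extra = strlen(suffix);

    if (b->f_copy_on_write) {
        /* Possibly shared: build a private copy.  The old buffer is
         * left untouched for its other handles and, eventually, the GC. */
        StrBuf *copy = malloc(sizeof *copy);
        assert(copy != NULL);
        copy->len  = b->len + extra;
        copy->data = malloc(copy->len + 1);
        assert(copy->data != NULL);
        memcpy(copy->data, b->data, b->len);
        memcpy(copy->data + b->len, suffix, extra + 1);
        copy->f_copy_on_write = 0;
        return copy;
    }

    /* Single handle: safe to grow in place. */
    char *grown = realloc(b->data, b->len + extra + 1);
    assert(grown != NULL);
    memcpy(grown + b->len, suffix, extra + 1);
    b->data = grown;
    b->len += extra;
    return b;
}
```

Note that the bit is never cleared here; that would be the job of the GC pass described above, once it has counted the handles. The RO/RW vtable variant would amount to swapping append_string for two functions, one per branch, and switching the vtable pointer in share_string instead of setting a flag.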
Lastly, the overhead of a bit might as well be a byte, and the garbage reduction by actually performing full-reference counting (albeit with a GC fallback) might make this worthwhile. Full ref-counting on strings would be harder to work with for XS-code though.

-Michael