Nicholas Clark <[EMAIL PROTECTED]> writes:
> How does the regexp replacement engine cope with this? By implementing
> all replacements as substr() type ops?
> [or behaving as if it implements... whilst cheating and doing it direct for
> scalars it understands?]
>
> Or don't we need to work this out at this time?

This sounds like something that *does* need working out. It is essentially
the problem of defining the string-related parts of the vtable API (which I
seem to recall is where this thread started, anyway!)

A possible approach would be to have per-scalar attribute(s) saying whether
the SV's string value is

* simple bytes
* variable length bytes (e.g. UTF8)
* complex (e.g. embedded attributes)

with different bits of the "PV" part of the API legal for each of these
options. For simple bytes, you can use the "here's a pointer to a buffer"
approach, which the regex engine etc. can handle efficiently; for the others
there are fancier (and less efficient) access methods.

Then chuck in more complications for shared copy-on-write strings, etc. etc.

Anyone want to volunteer to knock up an API in the next 5 minutes? :-)
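
As a starting point for discussion, here is a rough C sketch of the per-scalar attribute idea. Every name in it (str_kind, scalar, str_vtable, sv_make, sv_raw_buffer, sv_char_length) is made up for illustration and is not an existing Perl API. It is read-only, covers only the first two kinds of string, and ignores copy-on-write entirely. The point is just to show the direct buffer access being legal for simple bytes only, with everything else going through the vtable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-scalar attribute describing the string value. */
typedef enum {
    STR_SIMPLE_BYTES,   /* one byte per character */
    STR_VARLEN_BYTES,   /* variable-length encoding, e.g. UTF-8 */
    STR_COMPLEX         /* e.g. embedded attributes; not handled here */
} str_kind;

typedef struct scalar scalar;

/* Hypothetical string-related slice of a vtable. */
typedef struct {
    size_t (*char_length)(const scalar *s);
} str_vtable;

struct scalar {
    str_kind          kind;
    const str_vtable *vt;
    const char       *buf;     /* raw storage */
    size_t            nbytes;  /* length of buf in bytes */
};

/* Simple bytes: character count equals byte count. */
static size_t bytes_char_length(const scalar *s)
{
    return s->nbytes;
}

/* UTF-8: count every byte that is not a continuation byte (10xxxxxx). */
static size_t utf8_char_length(const scalar *s)
{
    size_t n = 0;
    for (size_t i = 0; i < s->nbytes; i++) {
        if (((unsigned char)s->buf[i] & 0xC0) != 0x80)
            n++;
    }
    return n;
}

static const str_vtable bytes_vtable = { bytes_char_length };
static const str_vtable utf8_vtable  = { utf8_char_length };

/* Build a scalar over an existing NUL-terminated string (no copy). */
scalar sv_make(const char *text, str_kind kind)
{
    scalar s;
    assert(text != NULL);
    assert(kind != STR_COMPLEX);  /* would need its own vtable */
    s.kind   = kind;
    s.vt     = (kind == STR_SIMPLE_BYTES) ? &bytes_vtable : &utf8_vtable;
    s.buf    = text;
    s.nbytes = strlen(text);
    return s;
}

/* The "here's a pointer to a buffer" fast path. Legal only for simple
 * bytes; returns NULL otherwise so the caller falls back to the vtable. */
const char *sv_raw_buffer(const scalar *s, size_t *len)
{
    assert(s != NULL);
    if (s->kind != STR_SIMPLE_BYTES)
        return NULL;
    if (len != NULL)
        *len = s->nbytes;
    return s->buf;
}

/* The slower, always-legal access path through the vtable. */
size_t sv_char_length(const scalar *s)
{
    return s->vt->char_length(s);
}
```

A regex engine built on something like this would try sv_raw_buffer() first and only drop to the vtable methods when it gets NULL back, which is roughly the "cheating for scalars it understands" that Nicholas describes.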