On 21 October 2014 23:21:37 GMT+01:00, Andrea Faulds <a...@ajf.me> wrote:
>Make array-like indexing with [] be by
>code points as you may be able to do that in constant time

If the internal representation is UTF-8, both code point and grapheme access require traversal unless you have some additional index structure. Both can be trivialised to byte access if you have detected and stored that the string is entirely ASCII, but otherwise you will nearly always have multiple widths within one string.

If the internal representation is UTF-16, code point access can be accelerated for any string containing only BMP characters (no surrogate pairs), since each code point then occupies exactly one 16-bit unit. The Perl 6 concept of "NFG" attempts to extend that advantage to grapheme access, and to points outside the BMP.
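
To make the cost difference concrete, here is a minimal Python sketch of code point indexing over both representations. It is an illustration only, not how PHP or any existing engine does it: the function names and the `is_ascii` / `bmp_only` flags are hypothetical, standing in for a property detected and stored when the string was created.

```python
from typing import Sequence


def utf8_codepoint_at(buf: bytes, index: int, is_ascii: bool = False) -> str:
    """Return the code point at position `index` of UTF-8 encoded `buf`."""
    if index < 0:
        raise IndexError(index)
    if is_ascii:
        # Hypothetical stored flag: one byte per code point, so O(1) access.
        return chr(buf[index])
    pos = 0   # byte offset
    seen = 0  # code points passed so far
    while pos < len(buf):
        lead = buf[pos]
        # The lead byte determines how many bytes the sequence occupies.
        if lead < 0x80:
            width = 1
        elif lead >> 5 == 0b110:
            width = 2
        elif lead >> 4 == 0b1110:
            width = 3
        elif lead >> 3 == 0b11110:
            width = 4
        else:
            raise ValueError("invalid UTF-8 lead byte")
        if seen == index:
            return buf[pos:pos + width].decode("utf-8")
        pos += width
        seen += 1
    raise IndexError(index)


def to_utf16_units(text: str) -> list:
    """Split `text` into a list of 16-bit UTF-16 code units."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def utf16_codepoint_at(units: Sequence[int], index: int, bmp_only: bool = False) -> str:
    """Return the code point at position `index` of a UTF-16 unit sequence."""
    if index < 0:
        raise IndexError(index)
    if bmp_only:
        # Hypothetical stored flag: no surrogate pairs, so O(1) access.
        return chr(units[index])
    pos = 0
    seen = 0
    while pos < len(units):
        unit = units[pos]
        # A high surrogate followed by another unit forms one code point.
        is_pair = 0xD800 <= unit <= 0xDBFF and pos + 1 < len(units)
        if seen == index:
            if is_pair:
                low = units[pos + 1]
                return chr(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            return chr(unit)
        pos += 2 if is_pair else 1
        seen += 1
    raise IndexError(index)
```

Without the flags, both functions walk the string from the start, which is the traversal cost described above. Neither handles graphemes: a combining sequence still counts as several code points, and that is the gap NFG is meant to close.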