> On 23 Oct 2014, at 14:44, Rowan Collins <rowan.coll...@gmail.com> wrote:
>
> Dmitry Stogov wrote on 21/10/2014 10:01:
>> The "right" approach, would be extending zend_string with "encoding" and
>> then adopting near all functions working with zend_string to take
>> "encoding" into account. But, of course, this is going to lead to much more
>> complicated solution (with some slowdown).
>
> Isn't that kind of what ext/mbstring does?
>
> I think that treating Unicode as nothing more than an encoding, and trying to
> hide all its complexity from the user, is not particularly wise. Unicode
> isn't just "ASCII, but bigger", so keeping the same API but making the
> implementation "work" with more characters isn't really "Unicode support".

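As a rough illustration of the quoted point that Unicode isn't just "ASCII, but bigger", here is a minimal sketch. It is written in Python rather than PHP, purely for demonstration; the variable names are made up and nothing here reflects how zend_string or ext/mbstring works internally. It shows three things an ASCII-shaped API has no place for: one visible character with two code point sequences, equality that depends on normalization, and a case mapping that changes the string's length.

```python
import unicodedata

# "é" written two ways: one precomposed code point (U+00E9),
# or "e" followed by a combining acute accent (U+0301).
composed = "\u00e9"
decomposed = "e\u0301"

# Both render the same, but they differ in code points and in UTF-8 bytes.
code_point_counts = (len(composed), len(decomposed))
utf8_byte_counts = (len(composed.encode("utf-8")),
                    len(decomposed.encode("utf-8")))

# A plain comparison sees two different strings; comparing after
# normalizing to NFC treats them as the same text.
same_raw = composed == decomposed
same_nfc = unicodedata.normalize("NFC", decomposed) == composed

# Case mapping is not one-to-one: German sharp s uppercases to "SS",
# so the result is longer than the input.
upper_sharp_s = "stra\u00dfe".upper()

print(code_point_counts, utf8_byte_counts, same_raw, same_nfc, upper_sharp_s)
```

None of these cases can arise with pure ASCII, which is why tagging a byte string with an encoding does not, on its own, answer what "length", "equals" or "uppercase" should mean.
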
I’m inclined to agree here. An encoding-aware zend_string and a Unicode-aware string aren’t quite the same thing. Certain string operations are only possible for certain encodings, and by supporting any encoding we risk making things confusing. I’d rather we convert everything to Unicode.

--
Andrea Faulds
http://ajf.me/

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php