On Sun, Mar 14, 2010 at 12:03 PM, Stan Vassilev <sv_for...@fmethod.com> wrote:
> UTF-8 also takes 4 bytes for representing characters in the higher bit
> planes, as quite a lot of bits are lost for every char in order to describe
> how long the code point is, and when it ends and so on. This means
> memory-wise it may not be of big benefit to Asian countries.

I remember Brian Aker saying that they chose to work internally with
UTF-8 for Drizzle. His explanation was that content from Asian
countries has so much English mixed in that, on average, UTF-8 still
had a lower footprint for them than UTF-16/32. I do not know where the
stats came from, but if they hold any truth, it is worth considering.
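
For anyone who wants to sanity-check the idea, here is a rough sketch in
Python that compares the encoded size of a string in the three
encodings. The helper name encoded_sizes and both sample strings are
made up for illustration; they are not Drizzle's data or the stats
mentioned above.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32.

    The explicit little-endian codecs are used so that no byte order
    mark is counted in the totals.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Hypothetical samples, chosen only to show the two extremes.
pure_cjk = "日本語のテキスト"            # 8 characters, all in the BMP
mixed = '<a href="/news">日本語</a>'    # 20 ASCII characters plus 3 CJK

for label, sample in (("pure CJK", pure_cjk), ("mixed", mixed)):
    print(label, encoded_sizes(sample))
```

With these samples the pure CJK string is smaller in UTF-16 (16 bytes
against 24 in UTF-8), while the mixed string is smaller in UTF-8 (29
bytes against 46 in UTF-16). Which case dominates depends entirely on
the real data, so this only shows how the trade-off works.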

Cheers,
Jordi

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
