> Courtesy of Slashdot, 
> http://www.hastingsresearch.com/net/04-unicode-limitations.shtml
> 
> I'm not sure if this is an issue for us or not, as we're generally 
> language-neutral, and I don't see any technical issues with any of the 
> UTF-* encodings having headroom problems.

I think the author has confused himself. Unicode by itself is not
sufficient to process human language, no matter how many characters it
includes. It is just a character encoding.

Take Chinese as an example: only a small percentage (<<10%) of Chinese
readers can read more than 6000 characters. The biggest dictionary I know
of includes about 65000 characters, and linguists cannot even agree with
each other on many of them. Some of the characters are more or less the
research results of the dictionary's authors. It is impossible to include
such characters in an international standard such as Unicode.

Unicode contains surrogates for future growth. We still have about 1M
code points left for allocation. Eventually it will include many more
characters than anyone could care about.
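
As a rough sketch of where the "about 1M" figure comes from: UTF-16 pairs
one of 1024 high surrogates with one of 1024 low surrogates to address
code points above U+FFFF. The function names below are mine, purely for
illustration, and this only shows the arithmetic, not a full encoder.

```python
# Illustrative sketch only: names are hypothetical, not from any library.
HIGH_START = 0xD800   # first high (lead) surrogate
LOW_START = 0xDC00    # first low (trail) surrogate
HALF_SIZE = 0x400     # 1024 values in each surrogate range


def supplementary_capacity():
    """Number of code points reachable through surrogate pairs."""
    return HALF_SIZE * HALF_SIZE


def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary range")
    offset = code_point - 0x10000
    high = HIGH_START + (offset >> 10)     # top 10 bits of the offset
    low = LOW_START + (offset & 0x3FF)     # bottom 10 bits of the offset
    return high, low


if __name__ == "__main__":
    print(supplementary_capacity())
    print([hex(unit) for unit in to_surrogate_pair(0x20000)])
```

So 1024 x 1024 = 1,048,576 code points sit beyond the original 16-bit
range, which is far more than the 65000 characters of the dictionary
mentioned above.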

Hong
