> Courtesy of Slashdot,
> http://www.hastingsresearch.com/net/04-unicode-limitations.shtml
>
> I'm not sure if this is an issue for us or not, as we're generally
> language-neutral, and I don't see any technical issues with any of the
> UTF-* encodings having headroom problems.

I think the author confused himself. Unicode by itself is not sufficient to process human language, no matter how many characters it includes. It is just an encoding.

Take Chinese as an example. Only a small percentage (<<10%) of Chinese people can read more than 6000 characters. The biggest dictionary I know of includes about 65000 characters, and even linguists cannot agree with each other about many of them. Some of the characters are more or less the research results of the dictionary's authors. It is impossible to include those characters in an international standard such as Unicode.

Unicode contains surrogates for future growth. We still have about 1M code points left for allocation. Eventually it will include many more characters than anyone could care about.

Hong
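For readers unfamiliar with the surrogate mechanism mentioned above, here is a minimal Python sketch of the UTF-16 surrogate-pair arithmetic. The function names are made up for illustration and are not from any library. Note that the capacity figure is the total space reachable through surrogates (about 1M code points), not a count of what is still unassigned in any particular Unicode version.

```python
# UTF-16 surrogate ranges: 1024 high (lead) and 1024 low (trail) code units.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF


def to_surrogates(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into a surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in the supplementary range")
    offset = cp - 0x10000          # 20-bit value
    high = HIGH_START + (offset >> 10)    # top 10 bits
    low = LOW_START + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Combine a high/low surrogate pair back into a single code point."""
    if not (HIGH_START <= high <= HIGH_END and LOW_START <= low <= LOW_END):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def supplementary_capacity():
    """Number of code points addressable through surrogate pairs."""
    return (HIGH_END - HIGH_START + 1) * (LOW_END - LOW_START + 1)


if __name__ == "__main__":
    high, low = to_surrogates(0x1D11E)    # MUSICAL SYMBOL G CLEF
    print(hex(high), hex(low))
    print(supplementary_capacity())
```

The 1024 x 1024 pairs give 1,048,576 code points beyond the original 16-bit range, which is where the "about 1M" headroom comes from.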