On 2007-09-26, Deborah Goldsmith <[EMAIL PROTECTED]> wrote:
> From an implementation point of view, UTF-16 is the most efficient
> representation for processing Unicode.
This depends on the characteristics of the text being processed.
Space-wise, English stays at 1 byte/char in UTF-8. Most European
languages go up to at most 2, and on average only a bit above 1.
Greek and Cyrillic are 2 bytes/char. It's really only the Asian,
African, Arabic, etc., that lose space-wise.

It's true that time-wise there are definite issues in finding
character boundaries.

--
Aaron Denney
-><-
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
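
The space figures above can be spot-checked with a few lines of code. What follows is only a minimal sketch in Python, not anything from the thread: the sample strings and the helper names `bytes_per_char` and `compare` are assumptions made up for illustration, and the averages depend entirely on which text is measured. UTF-16 is encoded as `utf-16-le` so that no byte-order mark is counted.

```python
# Sample strings are illustrative assumptions, not taken from the thread.
SAMPLES = {
    "English": "Unicode text processing",
    "German": "Größenänderung für Übung",
    "Greek": "κείμενο",
    "Russian": "обработка",
    "Japanese": "文字列処理",
}


def bytes_per_char(text: str, encoding: str) -> float:
    """Average number of encoded bytes per code point in `text`."""
    return len(text.encode(encoding)) / len(text)


def compare(text: str) -> tuple[float, float]:
    """Return (UTF-8, UTF-16) average bytes per code point, with no BOM."""
    return bytes_per_char(text, "utf-8"), bytes_per_char(text, "utf-16-le")


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        utf8, utf16 = compare(text)
        print(f"{name:10s} UTF-8: {utf8:.2f}  UTF-16: {utf16:.2f}")
```

The sketch counts code points rather than user-perceived characters, so combining sequences would shift the averages slightly. It also says nothing about the time-wise cost of finding character boundaries.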
