On 2007-09-26, Deborah Goldsmith <[EMAIL PROTECTED]> wrote:
>  From an implementation point of view, UTF-16 is the most efficient  
> representation for processing Unicode.

This depends on the characteristics of the text being processed.
Space-wise, English stays at 1 byte/char in UTF-8.  Most European
languages need at most 2 bytes/char, and on average only a bit above 1.
Greek and Cyrillic are 2 bytes/char, the same as in UTF-16, and so are
Arabic and Hebrew.  It's really only the scripts above U+07FF (CJK,
Indic, Thai, Ethiopic, etc.) that lose space-wise, at 3 bytes/char in
UTF-8 against 2 in UTF-16.
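
To make the arithmetic concrete, here is a rough sketch that counts
encoded bytes per code point using only the standard encoding ranges.
The names utf8Bytes, utf16Bytes and encodedSizes are made up for this
illustration, and the code assumes the input contains no lone
surrogates.

```haskell
import Data.Char (ord)

-- Bytes needed to encode one code point in UTF-8.
utf8Bytes :: Char -> Int
utf8Bytes c
  | n < 0x80    = 1   -- ASCII
  | n < 0x800   = 2   -- Latin supplements, Greek, Cyrillic, Hebrew, Arabic
  | n < 0x10000 = 3   -- rest of the BMP: Indic, Thai, CJK, ...
  | otherwise   = 4   -- supplementary planes
  where
    n = ord c

-- Bytes needed to encode one code point in UTF-16
-- (one 2-byte code unit, or a surrogate pair above the BMP).
utf16Bytes :: Char -> Int
utf16Bytes c
  | ord c < 0x10000 = 2
  | otherwise       = 4

-- Total encoded size of a string as (UTF-8 bytes, UTF-16 bytes).
encodedSizes :: String -> (Int, Int)
encodedSizes s = (sum (map utf8Bytes s), sum (map utf16Bytes s))
```

For example, encodedSizes "hello" gives (5,10), a six-letter Cyrillic
word gives (12,12), and a three-character CJK string gives (9,6).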

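It's true that time-wise there are definite issues in finding character
boundaries: UTF-8 is variable-width, so reaching the n-th character
means scanning the bytes before it rather than indexing directly.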
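
As an illustration of the boundary problem, here is a minimal sketch
that locates the byte offset of the n-th character by a linear scan.
It assumes the input is valid UTF-8 held as a plain list of bytes and
that n is non-negative; isContinuation and charOffset are names
invented for this example, not functions from any library.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- A UTF-8 continuation byte has the bit pattern 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation b = b .&. 0xC0 == 0x80

-- Byte offset at which the n-th character (0-based) starts.
-- Every byte that is not a continuation byte begins a character,
-- so we collect those offsets and pick the n-th one.  This is O(n)
-- in the number of bytes scanned; there is no constant-time index.
charOffset :: Int -> [Word8] -> Maybe Int
charOffset n bytes =
  case drop n [i | (i, b) <- zip [0 ..] bytes, not (isContinuation b)] of
    (i : _) -> Just i
    []      -> Nothing
```

The one consolation is that boundaries are self-synchronising: from any
byte position you can tell locally whether you are at the start of a
character, so scanning backwards or resuming mid-stream is cheap.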

-- 
Aaron Denney
-><-
