2009/8/15 Ludovic Courtès <l...@gnu.org>:

>  ** Incomplete support for Unicode characters and strings
>
>  Internally, strings are now represented either in the `latin-1'
>  encoding, one byte per character, or in UTF-32, with four bytes per
>  character.

Will this eventually move to UTF-8? European languages typically
use only a small handful of symbols outside Latin-1, mostly just misc
punctuation.  A recent dump of Voice of America radio broadcasts
that I ran through Guile used misc UTF-8 punctuation ... backwards-facing
double-quotes, ellipsis, etc.  I'd hate to see this common case
blow up to 32 bits per char just to accommodate stray punctuation.
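
A rough way to put numbers on this concern, sketched in Python rather
than Guile. The sample string is made up, and the sketch assumes that
a single character outside latin-1 forces the whole string into the
four-byte form, which is how the quoted NEWS text reads but is not
confirmed here.

```python
# Hypothetical sample: mostly ASCII text plus three punctuation marks
# that latin-1 cannot represent (curly quotes and an ellipsis).
sample = "\u201cHello, world\u2026\u201d " + "plain ASCII filler text. " * 4


def fits_latin1(text):
    """True if every character has a code point below 256."""
    return all(ord(ch) < 256 for ch in text)


def encoded_sizes(text):
    """Return the character count and the byte counts in UTF-8 and UTF-32."""
    return {
        "chars": len(text),
        "utf8": len(text.encode("utf-8")),
        # "utf-32-le" is used so that no byte-order mark is counted.
        "utf32": len(text.encode("utf-32-le")),
    }


sizes = encoded_sizes(sample)
print("fits in latin-1:", fits_latin1(sample))
print("characters:", sizes["chars"])
print("UTF-8 bytes:", sizes["utf8"])
print("UTF-32 bytes:", sizes["utf32"])
```

Each of the three punctuation marks takes three bytes in UTF-8, so the
UTF-8 size is the character count plus six, while the UTF-32 size is
four times the character count. The trade-off is that UTF-8 is
variable-width, so indexing by character position is no longer a
simple offset calculation.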

--linas

