2009/8/15 Ludovic Courtès <l...@gnu.org>:
> ** Incomplete support for Unicode characters and strings
>
> Internally, strings are now represented either in the `latin-1'
> encoding, one byte per character, or in UTF-32, with four bytes per
> character.

Will this eventually move to UTF-8? European languages typically use only a small handful of non-Latin symbols, mostly just misc punctuation. A recent dump of Voice of America radio broadcasts that I ran through Guile used misc UTF-8 punctuation: backwards-facing double quotes, ellipses, etc. I'd hate to see this common case blow up to 32 bits per char just to accommodate stray punctuation.

--linas
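
To put rough numbers on the concern above, here is a small Python sketch (not Guile code) that compares the byte counts of the same text in latin-1, UTF-8, and UTF-32. The helper name `storage_sizes` and the sample sentence are made up for illustration. It assumes that a string containing even one character outside latin-1 has to be stored entirely as UTF-32, which is my reading of the NEWS entry and not something checked against the implementation.

```python
def storage_sizes(text):
    """Return the encoded size in bytes of `text` for each encoding.

    latin-1 maps to None when the text contains a character that
    latin-1 cannot represent (e.g. curly quotes or an ellipsis).
    """
    sizes = {
        "utf-8": len(text.encode("utf-8")),
        # "utf-32-le" avoids the 4-byte byte-order mark that plain
        # "utf-32" would prepend, so we count only character data.
        "utf-32": len(text.encode("utf-32-le")),
    }
    try:
        sizes["latin-1"] = len(text.encode("latin-1"))
    except UnicodeEncodeError:
        sizes["latin-1"] = None
    return sizes


# Mostly ASCII text with three stray punctuation marks:
# U+201C, U+2026 and U+201D (none of them is in latin-1).
sample = "\u201cHello\u2026\u201d she said."

for encoding, size in storage_sizes(sample).items():
    print(encoding, size)
```

For this 18-character sample, UTF-8 needs 24 bytes (15 ASCII characters at one byte each plus three punctuation marks at three bytes each), UTF-32 needs 72 bytes, and latin-1 cannot encode it at all.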