Re: What Unicode means to us

Dan Sugalski Fri, 13 Aug 2004 01:06:34 -0700

At 4:15 PM -0400 8/10/04, Aaron Sherman wrote:

On Mon, 2004-08-09 at 14:14, Dan Sugalski wrote:
 Additionally if we have source text which is
 Latin-n, EBCDIC, ASCII, or whatever we must be
 able to convert it with no loss to Unicode.
 (Which I believe is now doable with Unicode 4.0)
 Losslessly converting Unicode to
 ASCII/EBCDIC/whatever is *not* required, which is
 fine as it's theoretically (and often
 practically) impossible.
Can I suggest instead:
If we have source text which is comprised of a non-Unicode character-set we must be able to convert it with minimal loss to Unicode (minimal being defined as zero for all Unicode-subset character sets). Converting Unicode to non-Unicode character sets will be lossless where possible, and will attempt to encode the name of the character in ASCII characters into the target character set.

Gack. No, I think this'd be a bad idea as the default behavior. What's right is up in the air -- I'm figuring we'll either throw an exception or substitute in a default character, but the full expansion's definitely way too much.

-- Dan
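
[Editor's illustration, not part of the original message.] The three behaviours discussed above for converting Unicode to a non-Unicode character set (throw an exception, substitute a default character, or expand to the character's name) can be compared in a minimal sketch. It assumes Python 3.5 or later and uses Python's built-in codec error handlers purely as a stand-in; none of this is Parrot's API, and the sample string and the helper name to_ascii are made up for the example.

```python
# Sample text with characters that have no ASCII equivalent (hypothetical input).
text = "na\u00efve caf\u00e9 \u2603"


def to_ascii(s, policy):
    """Encode s to ASCII using one of Python's codec error handlers."""
    return s.encode("ascii", errors=policy)


# Policy 1: throw an exception on the first unrepresentable character.
try:
    to_ascii(text, "strict")
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True

# Policy 2: substitute a default character ("?" for this codec).
replaced = to_ascii(text, "replace")

# Policy 3: expand each unrepresentable character to its Unicode name.
named = to_ascii(text, "namereplace")

# The other direction: every Latin-1 byte maps to a Unicode code point,
# so decoding and re-encoding returns the original bytes.
latin1_bytes = bytes(range(256))
round_trip_ok = latin1_bytes.decode("latin-1").encode("latin-1") == latin1_bytes

print(strict_raised)
print(replaced)
print(named)
print(round_trip_ok)
```

In this sketch the name expansion turns a single snowman character into a dozen ASCII bytes, which gives a sense of why the full expansion looks heavy as a default, while the substitution policy keeps the length but discards the information.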

--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
