At 05:54 PM 9/7/2001 -0400, Bryan C. Warnock wrote:
>On Friday 07 September 2001 05:51 pm, Dan Sugalski wrote:
> > >(Like
> > >Unicode Everywhere).
> >
> > Who's doing that? We're keeping things in native format as much as we can.
>
>If one of our stated goals is Unicode support (even for the source itself -
>that's what I meant by "everywhere": source, input, output), we're going to
>be a little more hindered than if we didn't have to worry about it at all,
>no?

No. We don't want Unicode everywhere because:

*) Conversion to Unicode is sometimes lossy
*) Conversion back out of Unicode is sometimes lossy
*) Converting when we know how to work on the underlying string data is 
wasted cycles
*) Lots of folks using non-7-bit ASCII have perfectly adequate character 
sets with defined operations, so why should they have to use Unicode if 
they don't need it?
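
A rough way to see the second point for yourself. This is only a sketch in Python, not anything from our codebase; the helper names survives_round_trip and lossy_encode are made up for illustration, and it only shows loss on the way out of Unicode, not on the way in:

```python
def survives_round_trip(text: str, encoding: str) -> bool:
    """True if text can leave Unicode for `encoding` and come back unchanged."""
    try:
        raw = text.encode(encoding)  # strict mode: raises on unmappable characters
    except UnicodeEncodeError:
        return False
    return raw.decode(encoding) == text


def lossy_encode(text: str, encoding: str) -> bytes:
    """Force the conversion anyway; unmappable characters are replaced with '?'."""
    return text.encode(encoding, errors="replace")


if __name__ == "__main__":
    # U+00E9 (e with acute) exists in Latin-1; U+20AC (euro sign) does not.
    for sample in ("caf\u00e9", "5 \u20ac"):
        print(repr(sample), survives_round_trip(sample, "latin-1"),
              lossy_encode(sample, "latin-1"))
```

Once the euro sign has become a '?', nothing can tell you what it used to be, which is the whole problem with converting when you don't have to.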

Unicode's sort of a greatest-common-multiple character set. We'll use it if 
we need to, but it's no panacea. (Unfortunately)

>Or will you only compare Granny Smiths with Granny Smiths?

If you compare, say, a Shift-JIS string to a Big5/traditional string, 
they'll probably both end up being converted to Unicode and the results 
compared. (Assuming that neither the Big5/traditional nor the Shift-JIS 
string library knows how to convert to the other losslessly.) And a plain 
string comparison for gt/lt is less straightforward than you might think...
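
Something like the following, as a minimal sketch. It is Python rather than anything we've actually written, the function name same_text is hypothetical, and it assumes that two byte strings in the same encoding can be compared for equality directly, which holds for simple stateless encodings but not necessarily for all of them:

```python
import codecs


def same_text(a: bytes, a_enc: str, b: bytes, b_enc: str) -> bool:
    """Equality across encodings: stay native if we can, else meet in Unicode."""
    if codecs.lookup(a_enc).name == codecs.lookup(b_enc).name:
        return a == b  # same encoding: no conversion, no wasted cycles
    # Different encodings: convert both sides to Unicode, then compare.
    return a.decode(a_enc) == b.decode(b_enc)


if __name__ == "__main__":
    text = "\u65e5\u672c"  # two Han characters present in both character sets
    sjis = text.encode("shift_jis")
    big5 = text.encode("big5")
    print(sjis == big5)                                 # raw bytes differ
    print(same_text(sjis, "shift_jis", big5, "big5"))   # same text via Unicode

    # Why gt/lt is harder: plain code point order is not dictionary order.
    print(sorted(["zebra", "Apple", "\u00e9clair"]))
```

The last line sorts by code point, so the accented word lands after "zebra". Getting an order a human would expect takes locale-aware collation, which is a separate problem from picking a common encoding.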

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
