At 1:04 PM +0100 11/14/04, Ron Blaschke wrote:
Thursday, November 11, 2004, 5:42:29 PM, Dan Sugalski wrote:
Or something like that.

[snip]

FWIW, I really like the idea.

Will there be a data type for "characters," or are those just strings
with a single grapheme?

Strings with a single grapheme. "Characters" can be multiple code points, so it's the only way to do it properly.
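
A minimal sketch of why that is, using Python purely for illustration (this is not Parrot's string API, and the variable names are made up): the same visible character can be spelled with one code point or with two, so a fixed-width "character" type cannot hold a grapheme in general.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point (U+00E9)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (U+0301)

# Python's len() counts code points, not graphemes.
print(len(composed))     # 1
print(len(decomposed))   # 2

# Both spell the same grapheme; after NFC normalization they compare equal.
same = unicodedata.normalize("NFC", decomposed) == composed
print(same)              # True
```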


There is direct code point access, and those are 32-bit unsigned ints. We're going to frown on most uses of those, since it's a good way to find yourself behaving really badly in a number of cases.
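
One way the bad behavior can show up, sketched in Python as an assumption-laden illustration (not Parrot code): operating on individual code points can split a grapheme, here by reversing a string so that a combining accent ends up attached to the wrong letter.

```python
text = "e\u0301x"   # grapheme "é" (e + combining acute accent), then "x"

# Every code point is a plain non-negative integer that fits in 32 bits.
code_points = [ord(c) for c in text]
fits_in_32_bits = all(0 <= cp < 2**32 for cp in code_points)

# Reversing code point by code point moves the accent onto the "x".
reversed_by_code_point = text[::-1]

print([hex(cp) for cp in code_points])       # ['0x65', '0x301', '0x78']
print(reversed_by_code_point == "x\u0301e")  # True: the accent now follows "x"
```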

Luckily Leo's last name'll make sure that we at least manage it properly in parrot. :)

As a side note, the Java people decided on UTF-16 Unicode "char"s,
and spent a good while getting Supplementary Characters (> U+FFFF) to
work.
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/

Yeah, that was something I didn't want to deal with. Unicode's got the largest range of code points, and it says 32 bits are enough. If it goes 64-bit at some point, well... Hopefully I'll be long-retired and not caring any more. :)
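
For illustration only (Python, not Java or Parrot; the choice of U+10400 as the sample character is arbitrary): a code point above U+FFFF needs two 16-bit units in UTF-16, a surrogate pair, but only one 32-bit unit.

```python
ch = "\U00010400"   # DESERET CAPITAL LETTER LONG I, above U+FFFF

code_point = ord(ch)
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")

# Split the encoded bytes into 16-bit and 32-bit units.
units16 = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
units32 = [int.from_bytes(utf32[i:i + 4], "big") for i in range(0, len(utf32), 4)]

print(hex(code_point))             # 0x10400
print([hex(u) for u in units16])   # ['0xd801', '0xdc00']
print([hex(u) for u in units32])   # ['0x10400']
```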
--
Dan


--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
