Thursday, November 11, 2004, 5:42:29 PM, Dan Sugalski wrote:
Or something like that.
[snip]
FWIW, I really like the idea.
Will there be a data type for "characters," or are those just strings with a single grapheme?
Strings with a single grapheme. "Characters" can be multiple code points, so it's the only way to do it properly.
There is direct code point access, and those are 32-bit unsigned ints. We're going to frown on most uses of those, since it's a good way to find yourself behaving really badly in a number of cases.
Luckily Leo's last name'll make sure that we at least manage it properly in parrot. :)
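To make the grapheme-versus-code-point distinction concrete, here is a minimal Python sketch. It is not Parrot's API: the helper names `code_points` and `rough_graphemes` are hypothetical, and the clustering rule (attach combining marks to the preceding base) is only a simplified stand-in for real grapheme segmentation as defined in UAX #29.

```python
import unicodedata


def code_points(s):
    """Return the code points of s as plain unsigned integers."""
    return [ord(ch) for ch in s]


def rough_graphemes(s):
    """Group each base code point with the combining marks that follow it.

    Assumption: this is a simplification for illustration only. Real
    grapheme segmentation (UAX #29) has more rules than this.
    """
    clusters = []
    for ch in s:
        # unicodedata.combining() returns a nonzero class for combining marks.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "o" followed by COMBINING DIAERESIS: two code points, one "character".
decomposed = "o\u0308"
print([hex(cp) for cp in code_points(decomposed)])
print(len(rough_graphemes(decomposed)))

# The precomposed form is a single code point for the same grapheme.
composed = unicodedata.normalize("NFC", decomposed)
print([hex(cp) for cp in code_points(composed)])
```

Indexing by code point would split the decomposed form in the middle of a character, which is the sort of misbehavior the direct code point access invites.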
As a side note, the Java people decided on UTF-16 Unicode "char"s, and spent some good time getting Supplementary Characters (> U+FFFF) to work. http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
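For reference, this is roughly why code points above U+FFFF are awkward with 16-bit "char"s: each one has to be split into a surrogate pair. The sketch below is in Python rather than Java, and `utf16_units` is a hypothetical helper written for illustration, not anything from Java or Parrot.

```python
def utf16_units(cp):
    """Return the UTF-16 code units (as ints) that encode code point cp."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp <= 0xFFFF:
        # Fits in a single 16-bit unit.
        return [cp]
    # Supplementary character: split the 20-bit offset into two halves.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


# U+0041 needs one unit; U+1D11E (MUSICAL SYMBOL G CLEF) needs two.
print([hex(u) for u in utf16_units(0x41)])
print([hex(u) for u in utf16_units(0x1D11E)])
```

With 32-bit code points the second case never arises, since every code point fits in one unit.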
Yeah, that was something I didn't want to deal with. Unicode's got the largest range of code points, and it says 32 bits are enough. If it goes 64-bit at some point, well... Hopefully I'll be long-retired and not caring any more. :)
--
Dan
--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk