I see. Does this mean that, if I expect to handle characters that need more than 16 bits, I should consider changing my character-handling functions to accept sequences or vectors of characters instead?
Also, how does (seq "\ud800\udc00") work? Does it split the character into two 16-bit characters? In the REPL, it seems to return (\? \?).

On Apr 26, 6:22 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
> On Apr 26, 2009, at 7:47 PM, samppi wrote:
>
> > user=> \u10000
> > java.lang.IllegalArgumentException: Invalid unicode character: \u10000
>
> > How would I embed the character as a literal in my Clojure code?
>
> Java characters are (still) 16 bits wide. A single Java character
> cannot represent the Unicode character you're looking to represent.
> Since Clojure characters are Java characters, you'll need to do this
> the way the Java folks do.
>
> I found a blog post about it here:
>
> http://weblogs.java.net/blog/joconner/archive/2004/04/unicode_40_supp...
>
> This is also a good reference:
>
> http://www.fileformat.info/info/unicode/char/10000/index.htm
>
> This representation as a string from that page does seem to work in
> Clojure:
>
> "\ud800\udc00"
>
> --Steve
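
For reference, here is a minimal sketch of how the surrogate pair can be inspected from Clojure through plain Java interop. It assumes a reasonably recent Clojure on a standard JVM. The name `code-points` is a hypothetical helper made up for illustration, not something in clojure.core. The methods it calls (`String.codePointAt`, `String.codePointCount`, `Character/charCount`, `Character/toChars`) are standard Java.

```clojure
;; U+10000 written as a UTF-16 surrogate pair, as suggested in the quoted reply.
(def ^String s "\ud800\udc00")

(count s)                        ; 2, the number of 16-bit chars
(.codePointCount s 0 (count s))  ; 1, the number of Unicode code points
(.codePointAt s 0)               ; 65536, i.e. 0x10000

;; seq walks the string one 16-bit char at a time, so it yields the high
;; surrogate and the low surrogate as two separate Characters.
(map int (seq s))                ; (55296 56320), i.e. 0xD800 0xDC00

;; Going the other way: build the string from the code point itself.
(= s (String. (Character/toChars 0x10000)))  ; true

;; Hypothetical helper: returns a vector of the code points in a string,
;; stepping over both halves of a surrogate pair at once.
(defn code-points [^String s]
  (loop [i 0, acc []]
    (if (< i (.length s))
      (let [cp (.codePointAt s i)]
        (recur (+ i (Character/charCount cp)) (conj acc cp)))
      acc)))

(code-points "a\ud800\udc00b")   ; [97 65536 98]
```

So (seq "\ud800\udc00") does appear to split the character into its two 16-bit halves. The (\? \?) in the REPL is probably just the output encoding failing to print a lone surrogate, not a sign that the data is lost. Functions that must treat such a character as one unit could work on integer code points like these instead of on Clojure characters.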