samppi wrote:
> I see. Does this mean that, if I expect to handle 32-bit characters,
> then I need to consider changing my character-handling functions to
> accept sequences of vectors instead?
>
> Also, how does (seq "\ud800\udc00") work? Does it split the character
> into two 16-bit characters? In the REPL, it seems to return (\? \?).

seq on a String returns a sequence of Java chars, which are 16-bit
UTF-16 code units, so a character outside the Basic Multilingual Plane
comes back as its two surrogates. To walk the string by code point
instead:

(defn codepoints-seq [s] ; returns a lazy seq of ints (code points)
  (let [s (str s)
        n (count s)
        f (fn this [i]
            (lazy-seq
              (when (< i n)
                ;; read the code point at i, then skip past it
                ;; (1 or 2 chars) to the start of the next one
                (cons (.codePointAt s i)
                      (this (.offsetByCodePoints s i 1))))))]
    (f 0)))

;; => (codepoints-seq "\ud800\udc00a\ud800\udd00")
;; (65536 97 65792)

-- 
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)
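
P.S. To make the split asked about in the quoted question visible, here is a minimal sketch that inspects the same string through plain Java interop. The names pair, units, n-chars, n-codepoints and whole are only illustrative, not part of any library. The (\? \?) at the REPL is most likely the console failing to print two lone surrogates, not a sign that the data is damaged.

```clojure
;; Assumption: all names defined here are hypothetical, chosen for illustration.
(def ^String pair "\ud800\udc00")        ; U+10000 written as a surrogate pair

;; seq yields one Java char per UTF-16 code unit
(def units (map int (seq pair)))         ; (55296 56320), i.e. 0xD800 0xDC00
(def n-chars (count pair))               ; 2 chars

;; the String itself still knows this is a single code point
(def n-codepoints (.codePointCount pair 0 (count pair)))  ; 1
(def whole (.codePointAt pair 0))                         ; 65536

(println units n-chars n-codepoints whole)
;; prints: (55296 56320) 2 1 65536
```

So functions that must handle characters beyond 16 bits can take a seq of ints (code points) in place of a seq of chars; nothing about the String has to change.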