This is most probably a repl-terminal encoding mismatch issue.

Are you using JLine? It seems that it cannot handle UTF-8.

There is a simple way to check for an encoding mismatch: just make a seq
of the string:

(seq t)

If everything is working correctly:
=> a seq of the 9 Chinese characters
Otherwise:
=> a seq of 27 unknown-character symbols
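
If the symbols are hard to tell apart on screen, printing the code points
can make the difference more obvious. A minimal sketch (code-points is a
hypothetical helper name, not something from Clojure itself; the sample
string is written with \u escapes so that it does not depend on the
terminal encoding):

```clojure
;; Hypothetical helper: show every char of a string as a hex code point.
(defn code-points [s]
  (map #(format "U+%04X" (int %)) s))

;; The first two characters of the string from the original post.
(code-points "\u8eca\u99ac")
;; => ("U+8ECA" "U+99AC")
```

Calling (code-points t) on a correctly decoded string should give 9 values
in the CJK range. If the bytes were decoded as Latin-1, you would instead
get 27 values, all below U+0100.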

It could also be worth investigating how the Clojure REPL creates a Reader
(which reads *characters*) from the System.in InputStream (which reads
*bytes*).
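
As a rough illustration of that step (this is only a sketch of the general
Java mechanism, not how the Clojure REPL actually does it; utf8-reader is a
made-up name, and a ByteArrayInputStream stands in for System/in):

```clojure
(import '(java.io BufferedReader InputStreamReader ByteArrayInputStream)
        '(java.nio.charset Charset))

;; The charset the JVM uses when none is given explicitly.
;; The value depends on the OS and JVM settings.
(def default-charset-name (str (Charset/defaultCharset)))

;; Wrap a byte-oriented InputStream in a character-oriented Reader,
;; naming the charset explicitly instead of relying on the default.
(defn utf8-reader [in]
  (BufferedReader. (InputStreamReader. in "UTF-8")))

;; Stand-in for System/in: the three UTF-8 bytes of one character.
(def one-char
  (let [bs (.getBytes "\u8eca" "UTF-8")]
    (.read (utf8-reader (ByteArrayInputStream. bs)))))
;; one-char => 36554 (that is, 0x8ECA: one character read from three bytes)
```

An InputStreamReader created without a charset argument uses the default
charset, which would fit the symptoms described above.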

BEGIN TECHNICAL DETAILS

A bit of background: that string of Chinese characters consists of 9
*characters*. (Try to select the text. You will be able to select it in 9
discrete steps.) These can be encoded to bytes with the UTF-8 encoding, in
which case the encoded string will consist of 27 *bytes*.
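
The two counts can be checked directly. A small sketch (the name s is just
for illustration; the string is the one from the original post, written
with \u escapes so the result does not depend on the terminal):

```clojure
;; The 9-character string from the original post, as \u escapes.
(def s "\u8eca\u99ac\u8c61\u58eb\u5c07\u58eb\u8c61\u99ac\u8eca")

(count s)
;; => 9   (characters)

(count (.getBytes ^String s "UTF-8"))
;; => 27  (bytes; each of these characters takes 3 bytes in UTF-8)
```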

Terminals are not character oriented, but byte oriented. This means that
both the user side (keyboard and screen) and the application (the Clojure
REPL) need to choose an encoding to use if they want to work with
characters.

In this case, the terminal is probably configured to use UTF-8, as it is
able to both emit and display the Chinese characters correctly. The 9
characters are then sent as 27 bytes to the Clojure REPL. If everything were
configured correctly, the REPL would decode the 27 bytes into 9 characters
again.

Now, the REPL was not configured correctly and probably used the OS default
encoding (which can be anything), which in this case must have been a
single-byte encoding like Latin-1/ISO-8859-1 or Mac Roman. The 27 bytes were
then decoded into 27 characters.
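
This can be reproduced without a terminal. A sketch that assumes the wrong
encoding was ISO-8859-1 (the actual default on the affected machine is not
known; utf8-bytes is just an illustrative name):

```clojure
;; The 27 bytes a UTF-8 terminal would send for the string in question.
(def utf8-bytes
  (.getBytes "\u8eca\u99ac\u8c61\u58eb\u5c07\u58eb\u8c61\u99ac\u8eca" "UTF-8"))

;; Decoded with a single-byte encoding: one character per byte.
(count (String. ^bytes utf8-bytes "ISO-8859-1"))
;; => 27

;; Decoded with the encoding the terminal actually used.
(count (String. ^bytes utf8-bytes "UTF-8"))
;; => 9
```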

If you evaluate t, the REPL will encode the 27 characters into 27 bytes
again and send them to the terminal, which will decode them into the 9
Chinese characters and display them.
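
The same round trip can be done by hand, which also shows that the original
text is still recoverable. A sketch, again assuming ISO-8859-1 was the
wrong encoding (redecode is a hypothetical helper, and this is a
workaround for inspection rather than a fix for the REPL setup):

```clojure
;; Hypothetical helper: undo a wrong decoding by re-encoding the string
;; with the wrong charset and decoding the bytes with the right one.
(defn redecode [^String s ^String wrong ^String right]
  (String. (.getBytes s wrong) right))

(def original "\u8eca\u99ac\u8c61\u58eb\u5c07\u58eb\u8c61\u99ac\u8eca")

;; Simulate what the misconfigured REPL ended up with: 27 characters.
(def garbled (String. (.getBytes ^String original "UTF-8") "ISO-8859-1"))

(count garbled)
;; => 27

(= original (redecode garbled "ISO-8859-1" "UTF-8"))
;; => true
```

This works for ISO-8859-1 because it maps every byte value to a character
and back without loss. Other single-byte encodings may not round-trip
every byte.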

END TECHNICAL DETAILS

// raek

2010/7/1 ngocdaothanh <ngocdaoth...@gmail.com>

> With 1.2-master-SNAPSHOT:
>
> (def t "車馬象士將士象馬車")
> (count t)  ; => 27
> (.length t)  ; => 27
>
> With 1.1, the result is 9.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>

