On Mon, Feb 05, 2001 at 11:24:55AM +0100, Zhu Ming wrote:
> Maybe you should not use the character set "UTF-8". I remember
> that it's 8-bit Unicode. As far as I know, Chinese and Korean
> have 16-bit codes, so you should at least try 16-bit Unicode.
> I forget the name; maybe it's "UTF-16". But I'm not sure whether
> the JDK fully supports "UTF-16".
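UTF-8 is a variable-length encoding: every Unicode code point,
including those that need 16 bits or more, is encoded as a sequence
of one to four 8-bit bytes. The "8" is the size of the code unit,
not a limit on the characters. A byte with its high-order bit clear
is plain ASCII; a lead byte with the high bit set tells you, by its
top bits, how many continuation bytes belong to the same code point.
So UTF-8 also handles the entire Unicode set. XML parsers assume
UTF-8 when a document has no encoding declaration or byte order
mark, so it's something that _should_ work 'out of the box'...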
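
To make the byte layout concrete, here is a minimal sketch, not
anything from the original problem code. The class name Utf8Demo and
the helpers sequenceLength and hex are made up for illustration, and
it assumes a JDK with java.nio.charset.StandardCharsets (Java 7 or
later; on an older JDK, getBytes("UTF-8") is the equivalent call). It
encodes one Latin, one Chinese and one Korean character and reads the
sequence length back from the lead byte:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class Utf8Demo {
    // Length of the UTF-8 sequence that starts with this lead byte,
    // or 0 if the byte is a continuation byte or not a valid lead byte.
    static int sequenceLength(byte lead) {
        int b = lead & 0xFF;                  // treat the byte as unsigned
        if (b < 0x80) return 1;               // 0xxxxxxx: plain ASCII
        if (b >= 0xC2 && b <= 0xDF) return 2; // 110xxxxx
        if (b >= 0xE0 && b <= 0xEF) return 3; // 1110xxxx
        if (b >= 0xF0 && b <= 0xF4) return 4; // 11110xxx
        return 0;                             // 10xxxxxx or invalid
    }

    // Space-separated uppercase hex dump of a byte array.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Latin capital A, a Chinese character (U+4E2D), a Korean syllable (U+D55C)
        String[] samples = { "A", "\u4E2D", "\uD55C" };
        for (String s : samples) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            String back = new String(utf8, StandardCharsets.UTF_8);
            System.out.println(hex(utf8)
                + " -> sequence length " + sequenceLength(utf8[0])
                + ", round trip ok: " + back.equals(s));
        }
    }
}
```

Both the Chinese and the Korean character fit in a single 16-bit Java
char, yet each becomes three bytes in UTF-8 and decodes back to the
same string, so nothing is lost by choosing UTF-8 over UTF-16.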
-MM
P.S. I've also posted this problem to HotDispatch so if you
can help me solve the problem you could get $50... ;-)
--
--------------------------------------------------------------------------------
Michael Mealling | Vote Libertarian! | www.rwhois.net/michael
Sr. Research Engineer | www.ga.lp.org/gwinnett | ICQ#: 14198821
Network Solutions | www.lp.org | [EMAIL PROTECTED]