In <[email protected]>, on
03/26/2012
at 07:20 AM, Lloyd Fuller <[email protected]> said:
>Depending upon the characters used, some of the UTF-8 characters
>are really 16-bits.
For large values of 16. The Unicode -> UTF-8 mapping is
Char. number range | UTF-8 octet sequence
(hexadecimal) | (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
--
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see <http://patriot.net/~shmuel/resume/brief.html>
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN