In <[email protected]>, on 03/26/2012
   at 07:27 AM, Lloyd Fuller <[email protected]> said:
>UNICODE does, but not necessarily subsets of the full UNICODE.
>UTF-8 is a subset and UTF-16 is a subset.
No, they are transforms, capable of representing all defined Unicode
characters.
>So it does not surprise me that UTF-8 does not have the Hebrew
>alphabet.
It would surprise me if it were true, which it isn't.
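For anyone who wants to check this rather than take my word for it, here is a minimal Python sketch. The variable names are mine and purely illustrative; the only assumption is that the Hebrew letters alef through tav sit at U+05D0 through U+05EA. It round-trips that range through UTF-8, UTF-16 and UTF-32:

```python
# Hebrew letters alef..tav occupy U+05D0..U+05EA
# (27 code points, counting the five final forms).
hebrew = "".join(chr(cp) for cp in range(0x05D0, 0x05EB))

for codec in ("utf-8", "utf-16-be", "utf-32-be"):
    encoded = hebrew.encode(codec)
    # Each encoding form must decode back to the identical string.
    assert encoded.decode(codec) == hebrew
    print(codec, len(encoded), "bytes")

# The UTF-8 form of alef (U+05D0) is a two-octet sequence.
alef_utf8 = "\u05d0".encode("utf-8")
```

All three forms carry the same characters; only the octet counts differ.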
>but to get the full set you need UTF-32.
Not even close. From RFC 3629:
   UTF-8 encodes UCS characters as a varying number of octets,
   where the number of octets, and the value of each, depend on the
   integer value assigned to the character in ISO/IEC 10646 (the
   character number, a.k.a. code position, code point or Unicode
   scalar value). This encoding form has the following
   characteristics (all values are in hexadecimal):
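To make the "varying number of octets" concrete, here is a rough Python sketch of an encoder. The function name `utf8_encode` is hypothetical, and the range boundaries and bit layout are taken from my reading of RFC 3629 section 3, not from the excerpt above. It is an illustration, not a replacement for a library codec:

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode scalar value as UTF-8 (illustrative sketch)."""
    # Scalar values run from 0 to 10FFFF, excluding the surrogates.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp <= 0x7F:        # 1 octet:  0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:       # 2 octets: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:      # 3 octets: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 octets: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

Four octets are enough to reach 10FFFF, the top of the Unicode code space, so there is nothing that UTF-32 can represent and UTF-8 cannot.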
--
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see <http://patriot.net/~shmuel/resume/brief.html>
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)