In <[email protected]>, on 03/26/2012
   at 07:27 AM, Lloyd Fuller <[email protected]> said:

>UNICODE does, but not necessarily subsets of the full UNICODE.  
>UTF-8 is a subset and UTF-16 is a subset.

No, they are transformation formats, not subsets; each one is capable
of representing all defined Unicode characters.

>So it does not surprise me that UTF-8 does not have the Hebrew
>alphabet.

It would surprise me if it were true, which it isn't.
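
This is easy to check. The following is only a minimal sketch in
Python, not anything taken from the standards; the names HEBREW and
encoded_sizes are made up for illustration. It takes the Hebrew letters
at U+05D0 through U+05EA and round-trips them through UTF-8, UTF-16 and
UTF-32:

```python
# Hebrew letters alef..tav, U+05D0..U+05EA (22 letters plus 5 final forms).
HEBREW = "".join(chr(cp) for cp in range(0x05D0, 0x05EB))

def encoded_sizes(text):
    """Round-trip text through each UTF and return the octet count per encoding."""
    sizes = {}
    for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
        octets = text.encode(encoding)
        # A lossless transform must decode back to the identical string.
        assert octets.decode(encoding) == text
        sizes[encoding] = len(octets)
    return sizes

print(encoded_sizes(HEBREW))
```

All three round-trip without loss; only the number of octets differs.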

>but to get the full set you need UTF-32.

Not even close. From RFC 3629:

   UTF-8 encodes UCS characters as a varying number of octets,
   where the number of octets, and the value of each, depend on the
   integer value assigned to the character in ISO/IEC 10646 (the
   character number, a.k.a. code position, code point or Unicode
   scalar value).  This encoding form has the following
   characteristics (all values are in hexadecimal):
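
To illustrate the "varying number of octets" part, here is a small
Python sketch (my own illustration, not part of the RFC; SAMPLES and
utf8_hex are hypothetical names). It encodes one character from each
length class and shows the resulting octets in hexadecimal:

```python
# One sample character per UTF-8 length class (1 to 4 octets).
SAMPLES = {
    "LATIN CAPITAL LETTER A": "\u0041",
    "HEBREW LETTER ALEF": "\u05D0",
    "EURO SIGN": "\u20AC",
    "MUSICAL SYMBOL G CLEF": "\U0001D11E",
}

def utf8_hex(ch):
    """Return the UTF-8 octets of ch as space-separated uppercase hex."""
    return " ".join(f"{octet:02X}" for octet in ch.encode("utf-8"))

for name, ch in SAMPLES.items():
    hex_octets = utf8_hex(ch)
    count = len(hex_octets.split())
    print(f"U+{ord(ch):04X} {name}: {hex_octets} ({count} octets)")
```

The higher the code point, the more octets are used, up to four; a
character outside the BMP such as U+1D11E is still representable.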

 
-- 
     Shmuel (Seymour J.) Metz, SysProg and JOAT
     ISO position; see <http://patriot.net/~shmuel/resume/brief.html> 
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
