On Sun, Nov 19, 2000 at 10:50:54PM +0100, Bernd Eckenfels wrote:
> On Sat, Nov 18, 2000 at 08:01:11PM -0600, David Starner wrote:
> > Which includes the Chinese and Japanese, who need the characters found
> > in the Supplementary Ideographic Planes, which means 4 byte characters.
>
> Afaik UTF8 is not able to encode 32bit unicode?

No, UTF-8 can encode it. As currently defined it covers the full 31-bit
UCS-4 range, using up to 6 bytes per character. UTF-16 can only reach
U+10FFFF, which is about 20.1 bits' worth of code points, so Unicode got
chopped off there, but everyone rounds that up to 4 bytes.

> I thought this is because
> the "living" languages are all restricted to 16bit? Hmm... i might be wrong.
> Does that mean Java does not support asian languages with its 16bit Unicode?

The major 'living' languages are in the Basic Multilingual Plane, which
is 16-bit Unicode. Japanese and Chinese are supported by the characters
in the BMP as well as they are by any pre-1995 CJK standard, but they are
in the process of standardizing ~50,000 ideographs for Chinese and
Japanese outside the BMP. They're mostly very rare characters, but they
are really necessary for full support of CJK in Unicode. Java doesn't
currently support them, but plans to when they actually get added to
the standard.

-- 
David Starner - [EMAIL PROTECTED]
http://dvdeug.dhis.org
Looking for a Debian developer in the Stillwater, Oklahoma area
to sign my GPG key
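
To make the byte counts discussed above concrete, here is a minimal
Python sketch. It is an illustration only and not part of the original
exchange: the two sample characters (U+4E2D from the BMP and U+20000
from the Supplementary Ideographic Plane) and all variable names are
assumptions chosen for the example.

```python
# Illustrative sketch: compare how a BMP ideograph and a supplementary-plane
# ideograph are encoded. The sample characters are assumptions picked for
# the example, not taken from the thread.
bmp_char = "\u4e2d"        # U+4E2D, inside the Basic Multilingual Plane
sip_char = "\U00020000"    # U+20000, outside the BMP

bmp_utf8 = bmp_char.encode("utf-8")
sip_utf8 = sip_char.encode("utf-8")
bmp_utf16 = bmp_char.encode("utf-16-be")
sip_utf16 = sip_char.encode("utf-16-be")

# UTF-8: 3 bytes for the BMP ideograph, 4 bytes outside the BMP.
print(len(bmp_utf8), bmp_utf8.hex())
print(len(sip_utf8), sip_utf8.hex())

# UTF-16: one 16-bit unit inside the BMP, a surrogate pair (two units,
# so 4 bytes) outside it.
print(len(bmp_utf16), bmp_utf16.hex())
print(len(sip_utf16), sip_utf16.hex())

# Split the 4-byte UTF-16 form into its high and low surrogates.
high = int.from_bytes(sip_utf16[:2], "big")
low = int.from_bytes(sip_utf16[2:], "big")
print(hex(high), hex(low))
```

Either way, a character outside the BMP ends up taking 4 bytes, which is
why a strictly 16-bit character type cannot hold one in a single unit.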