On 12.02.2012 23:07, Terry Reedy wrote:
> But because of the limitation of ascii on a worldwide, as opposed to
> American basis, we ended up with 100-200 codings for almost as many
> character sets. This is because the idea of ascii was applied by each
> nation or language group individually to their local situation.
You really learn to appreciate Unicode when you have to deal with mixed languages in texts and with old databases from the '70s and '80s. I'm working with books that contain medieval German, old German, modern German, English, French, Latin, Hebrew, Arabic, ancient and modern Greek, Rhaeto-Romanic, Eastern European languages and more. Sometimes three or four languages are used in a single book. Some books are more than 700 years old and contain glyphs that aren't covered by Unicode yet. Without Unicode it would be virtually impossible to deal with them.

Metadata for these books comes from old and proprietary databases and is stored in a format that is optimized for magnetic tape. Most people will never have heard of ISO 5426 or ANSEL encoding, or of file formats like MAB2, MARC or PICA. It took me quite some time to develop codecs to encode and decode an old and partly undocumented variable-length multibyte encoding that predates UTF-8 by about a decade. Of course every system interprets the undocumented parts slightly differently ...

Unicode and XML are bliss for metadata exchange and long term storage!
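For the curious, here's roughly what wiring such a codec into Python looks like. This is only a minimal sketch, not my actual code: the codec name "ansel_sketch" and the two-entry mapping table are invented for illustration, and a real ANSEL/ISO 5426 codec needs far bigger tables plus handling for the combining diacritics that precede their base character, which is the genuinely painful part.

import codecs

# Tiny illustrative fragment of a decode table; real tables are much
# larger and the undocumented corners differ between systems.
_DECODE_TABLE = {
    0xB2: "\u00F8",   # -> 'ø' (illustrative mapping)
    0xB3: "\u0111",   # -> 'đ' (illustrative mapping)
}
_ENCODE_TABLE = {v: k for k, v in _DECODE_TABLE.items()}

def _decode(data, errors="strict"):
    out = []
    for byte in data:
        if byte < 0x80:
            out.append(chr(byte))            # ASCII range passes through
        elif byte in _DECODE_TABLE:
            out.append(_DECODE_TABLE[byte])
        elif errors == "replace":
            out.append("\ufffd")
        else:
            raise UnicodeDecodeError("ansel_sketch", bytes(data),
                                     0, 1, "unmapped byte")
    return "".join(out), len(data)

def _encode(text, errors="strict"):
    out = bytearray()
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ord(ch))
        elif ch in _ENCODE_TABLE:
            out.append(_ENCODE_TABLE[ch])
        else:
            raise UnicodeEncodeError("ansel_sketch", text,
                                     0, 1, "unmapped character")
    return bytes(out), len(text)

def _search(name):
    # Search functions receive the normalized codec name.
    if name == "ansel_sketch":
        return codecs.CodecInfo(_encode, _decode, name="ansel_sketch")
    return None

codecs.register(_search)

print(bytes([0x68, 0xB2, 0x6A]).decode("ansel_sketch"))  # prints: høj

Once registered, the codec plugs into the normal str.encode()/bytes.decode() machinery, which is exactly what makes converting legacy dumps to Unicode bearable.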