Marcus Brinkmann <[EMAIL PROTECTED]> writes:

> UTF-8 is an insanely complex standard, if you start to look down its
> depths.

UTF-8 is a complex standard, but not insanely so. It is complex because the problem it represents is very complex. It is a standard computer programmer's disease to start talking about how much easier the world would be if every language written in the Latin alphabet had the same rules for capitalizing I. They don't, and it's the computer's job to make both Turkish and French work. Complaining that this is hard is crazy; good grief, it's certainly no harder than a fancy VM system.

[The reference is that Turkish has two letters I, I-with-dot and I-without-dot, and each has its own uppercase and lowercase form. In French, lowercase I-with-dot capitalizes to capital I-without-dot. In Turkish, lowercase I-with-dot maps to capital I-with-dot, and lowercase I-without-dot maps to capital I-without-dot.]

So, faced with a long history of computer programmers doing just enough to get by and pretending that writing systems were simpler than they really are, the Unicode designers laudably set the goal of adapting to the world, rather than forcing the world to adapt to the damn computer.

Thomas
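
To make the Turkish I example concrete, here is a minimal Python sketch. It assumes Python 3, whose built-in str.upper() and str.lower() apply only the locale-independent Unicode case mappings. The helper names upper_tr and lower_tr are made up for illustration and are not part of any library; real code would more likely lean on a locale-aware library such as ICU.

```python
# Hypothetical helpers illustrating Turkish case rules for the letter I.
# U+0130 is capital I-with-dot; U+0131 is lowercase I-without-dot.
DOTTED_CAPITAL_I = "\u0130"
DOTLESS_SMALL_I = "\u0131"


def upper_tr(text: str) -> str:
    """Uppercase text using Turkish rules for the two letters I."""
    # Handle the two special letters first, then let the default
    # mapping take care of everything else.
    text = text.replace("i", DOTTED_CAPITAL_I)
    text = text.replace(DOTLESS_SMALL_I, "I")
    return text.upper()


def lower_tr(text: str) -> str:
    """Lowercase text using Turkish rules for the two letters I."""
    # Capital I-with-dot must be replaced before plain capital I,
    # so that the two letters are not confused with each other.
    text = text.replace(DOTTED_CAPITAL_I, "i")
    text = text.replace("I", DOTLESS_SMALL_I)
    return text.lower()


if __name__ == "__main__":
    word = "istanbul"
    print(word.upper())    # default mapping: capital I-without-dot
    print(upper_tr(word))  # Turkish mapping: capital I-with-dot
```

The point of the sketch is only that the same lowercase letter needs two different uppercase forms depending on the language, so case conversion cannot be a fixed per-character table.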