On Sat, Jun 07, 2003 at 04:59:29PM +0300, Dmitry Borodaenko wrote:
> On Thu, Jun 05, 2003 at 08:57:06PM -0400, Colin Walters wrote:
> JR>> the only thing that will change is that if someone complains at
> JR>> people who use UTF-8 in changelogs, a new retort will be
> JR>> available, "THE POLICY MADE ME DO IT!!1!", or similar.
> CW> Why would someone complain?
>
> I would complain.
>
> I am using KOI8-R terminal which can not display Latin-1 characters,

Where did Latin-1 come into this?

> and it seems backward to me to mandate or even allow _usage_ of UTF-8
> ahead of getting it _supported_ across the system.

If you find yourself with a UTF-8 file, use a program which knows how to
recode on the fly to your native encoding. Such programs are
increasingly common. What do you lose here? Those who have fonts that
can display the character in question will be able to do so; those who
don't won't, but will see some reasonably obvious indicator like a "?"
or a filled-in square to show that the character is one they can't
display. This is superior to the situation where those who don't have
such fonts just see some gibberish.

> I'd rather have 7-bit ASCII changelogs: why Latin-1 users are
> privileged to use native spelling of their names, while Cyrillic and
> Kanji and other users have to resort to transliteration?

They aren't so privileged. They may decide to do it anyway, but since
the encoding of changelogs is not yet specified you currently take pot
luck on anything outside 7-bit ASCII.

I believe you've just contradicted yourself, anyway. Nobody wants to
have to transliterate their name. I don't want to have to transliterate
the names of people who help me with my packages when I credit them in
the changelog; in some cases I may not even know how to transliterate
their names correctly. UTF-8 allows me to spell their names correctly.
At worst, a couple of characters may not be displayed properly for
people using legacy encodings who don't have software that can recode
for them, but if I'd artificially transliterated to 7-bit ASCII then
nobody would get to see the correct spellings anyway. Since UTF-8
includes ASCII, all the technical content of my changelogs will still
appear normally no matter what locale you're using, but suddenly it
becomes possible for me to credit my contributors properly regardless of
whether they come from Spain, Russia, or Japan.
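
To make the recoding point above concrete, here is a minimal sketch in
Python. It is not how any particular pager or terminal does it; the
function name recode_for_terminal and the sample strings are made up for
illustration, and KOI8-R is assumed as the reader's native encoding.

```python
def recode_for_terminal(utf8_bytes: bytes, terminal_encoding: str = "koi8_r") -> str:
    """Show UTF-8 input as a terminal using `terminal_encoding` could render it.

    Hypothetical helper: characters the target encoding cannot represent
    are replaced by "?", so they show up as an obvious marker.
    """
    text = utf8_bytes.decode("utf-8")
    # errors="replace" substitutes "?" for each unencodable character.
    recoded = text.encode(terminal_encoding, errors="replace")
    # Decode again so the result is a str holding what would be displayed.
    return recoded.decode(terminal_encoding)


if __name__ == "__main__":
    samples = [
        "Fix typo in manpage",  # plain ASCII passes through untouched
        "Дмитрий",              # Cyrillic is representable in KOI8-R
        "José",                 # "é" is not, so it becomes "?"
    ]
    for sample in samples:
        print(recode_for_terminal(sample.encode("utf-8")))
```

The ASCII line comes out unchanged because ASCII is a subset of both
UTF-8 and KOI8-R, which is why the technical content of a changelog
survives whatever the reader's locale.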

We're not talking about mandating the use of UTF-8 across the whole
system here. We're talking about recommending its use in one particular
case where it gives a small but real benefit, and where the consequences
of getting it wrong are not very important (we can always go back and
recode a few changelogs if some unforeseen badness results). Think of it
as a safe experiment in advance of wider deployment of UTF-8 later on.
Package maintainers who aren't set up for writing UTF-8 can always
resort to transliteration into ASCII if need be.

-- 
Colin Watson [EMAIL PROTECTED]