On Fri, Aug 01, 2003 at 01:47:59PM +0200, Celso González wrote:
> On Fri, Aug 01, 2003 at 01:33:18PM +0200, Eduard Bloch wrote:
> > #include <hallo.h>
> > * Celso González [Fri, Aug 01 2003, 01:14:33PM]:
> >
> > > So when the uploader checks the name of the Maintainer in the control
> > > file (not utf-8) with the name in the changelog says that are different,
> >         ^^^^^^^^^
> >
> > Show me where policy tells you not to use UTF-8 in control files but
> > some other non-ascii charset with some other encoding instead.
>
> Well, I think is not clear enough
> A reflexion from the guy that suggest the change in policy
>
> Extracted from C2.2
> "Now, we can't switch to using UTF-8 for package control fields
> and the like until dpkg has better support, but one thing we can
> start doing today is requesting that Debian changelogs are UTF-8
> encoded"
>
> Maintainer is a control field
>
> The solution is that both files have the same encoding (both latin1 or
> both utf-8) but i'm not sure that utf-8 is correct for debian/control

Right now, dpkg has no explicit support for anything other than ASCII in
debian/control: that is to say, it doesn't attempt to recode maintainer
names to the current locale when asked to display control information
with 'dpkg -s', etc. However, it doesn't support Latin-1 any better than
it supports UTF-8! If you think that this is a problem, then the answer
is not to use Latin-1, but to use only ASCII.

That said, the problems caused by using an encoding that dpkg doesn't
support are not serious; they just mean that some people will see your
name wrongly when they type 'dpkg -s', but hey, that would happen anyway
(particularly if you don't like the ASCII transliteration of your name).

With that knowledge, if you're going to pick a non-ASCII encoding, then
UTF-8 is almost certainly the way to go. It seems unlikely to me that we
would select anything other than UTF-8 as the 8-bit encoding for control
files.

As a side note, I'd love to get rid of Latin-1 in maintainer names; it's
currently difficult for the BTS to declare any character set in the web
pages it generates, since some of the maintainer names it prints are in
Latin-1 and some in UTF-8 and it can't easily tell which are which.
Switching to UTF-8 throughout would solve that problem.

So, to summarize, please either use plain ASCII (if you think that the
lack of recoding is a problem) or UTF-8 (if you don't mind); using other
legacy encodings is just storing up trouble.

Cheers,

-- 
Colin Watson                                  [EMAIL PROTECTED]
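
P.S. For anyone who wants to check which of the three cases their own
Maintainer field falls into (plain ASCII, UTF-8, or some legacy 8-bit
encoding), here is a rough sketch in Python. It is not part of dpkg or
any Debian tool: the function names classify_encoding and
maintainer_field are made up for illustration, and the sample maintainer
is hypothetical. It is only a heuristic, since a legacy-encoded string
can occasionally happen to be valid UTF-8 as well.

```python
def classify_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a blob of bytes.

    'other' means the bytes are neither pure ASCII nor valid UTF-8,
    which usually indicates Latin-1 or another legacy 8-bit encoding.
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"


def maintainer_field(control: bytes) -> bytes:
    """Return the raw value of the first Maintainer field, or b'' if absent.

    Works on bytes on purpose, so that no encoding is assumed up front.
    """
    for line in control.splitlines():
        if line.lower().startswith(b"maintainer:"):
            return line.split(b":", 1)[1].strip()
    return b""


# Hypothetical control-file fragment, encoded three different ways.
ascii_text = "Package: foo\nMaintainer: Jose Example <jose@example.org>\n"
accented_text = "Package: foo\nMaintainer: José Example <jose@example.org>\n"

samples = {
    "transliterated": ascii_text.encode("ascii"),
    "utf8": accented_text.encode("utf-8"),
    "latin1": accented_text.encode("latin-1"),
}

for label, control in samples.items():
    print(label, classify_encoding(maintainer_field(control)))
```

Running the same check over debian/changelog as well would show whether
the two files agree, which is the mismatch that started this thread.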