Josselin Mouette <[EMAIL PROTECTED]> writes:

> On Sunday 05 December 2004 at 11:43 +0100, Andreas Barth wrote:
>> I think most of us agree that non-UTF-8-characters are not a good idea
>> (please note the UTF-8-characters is a superset of ASCII). For some
>> places (like package names), I think most of us even agree that only
>> ASCII-characters should be used. Also, there is the proposal that in
>> other fields (i.e. names), an translation should (also) be used if the
>> characters are not in some basic classes (more or less: ASCII plus
>> ASCII-similar letters).
>>
>> So, I personally consider non-UTF-8-characters an bug, and
>> UTF-8-not-ASCII on the way from bug to allowed.
>
> Many of us have names that can't be written using ASCII. Furthermore,
> the Debian tools need consistency between the developer name in the
> changelog and the Maintainer/Uploaders fields in the control file. The
> only way for these developers to have a policy-compliant changelog
> without having their uploads considered as NMUs is to encode the control
> file in UTF-8.
> --
>  .''`.   Josselin Mouette   /\./\
> : :' :   [EMAIL PROTECTED]
> `. `'    [EMAIL PROTECTED]
>   `-     Debian GNU/Linux -- The power of freedom

Which means that every program that parses control files, changelog
files, changes files, or Packages and Sources files has to be truly
converted to UTF-8: dpkg, apt, aptitude, dselect, apt-proxy,
apt-cacher(?), debmirror, debpartial-mirror, DAK, cdebootstrap, ...

I guess most of them only work by luck with the mixture of encodings we
have now. We have already had cdebootstrap crashes because of it (its
parser was a bit stricter than the rest).

On that note, how likely is it to hit a UTF-8 character encoding that
contains a '\n'? Any non-UTF-8-aware parser would assume a new line has
started and fail with parse errors.

Regards,
Goswin
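
On the '\n' question: by the design of UTF-8, every byte of a multi-byte sequence has its high bit set (lead bytes and continuation bytes are all 0x80 or above), so a 0x0A byte in valid UTF-8 data can only be a real line feed. The Python sketch below checks this exhaustively and then splits a small record bytewise, the way an encoding-unaware parser would. The function names and the sample maintainer name are made up for illustration; this is not code from any of the tools listed above.

```python
def multibyte_sequence_contains_ascii_byte() -> bool:
    """Return True if any non-ASCII code point encodes to a byte below 0x80."""
    for cp in range(0x80, 0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded in UTF-8
        if any(b < 0x80 for b in chr(cp).encode("utf-8")):
            return True
    return False


def split_lines_bytewise(data: bytes) -> list[bytes]:
    """Split on the raw 0x0A byte without decoding, as a naive parser would."""
    return data.split(b"\n")


# Hypothetical control-file fragment with a non-ASCII name (illustration only).
SAMPLE = "Package: example\nMaintainer: Zoë Müller <zoe@example.org>\n".encode("utf-8")

if __name__ == "__main__":
    print(multibyte_sequence_contains_ascii_byte())
    for line in split_lines_bytewise(SAMPLE):
        print(line)
```

Under these assumptions a bytewise line splitter sees the same line boundaries as a UTF-8-aware one. That only holds for valid UTF-8: other multi-byte encodings, or fields in mixed legacy encodings, do not give the same guarantee, and a parser that rejects bytes above 0x7F will still fail.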