On Wed, Mar 06, 2013 at 01:45:14PM +0900, Charles Plessy wrote:
> On Sat, Mar 02, 2013 at 04:38:49PM +0100, Guillem Jover wrote:
> >
> > I'd second something like this, but I'd first like us to consider if
> > we really want any non-ASCII characters in filenames. Currently on sid
> > there do not appear to be many such filenames (64 from my check, if
> > that's not bogus):
> >
> >   $ LC_ALL=C zgrep '[^[:print:]]' \
> >       ftp.debian.org_debian_dists_sid_*_Contents-amd64.gz | wc -l
>
> Hi Guillem and everybody,
>
> I had a closer look at these files.
>
>  * There are dictionaries where the filename is the native name of the
>    language, like català, español, bokmål, etc. In all cases the
>    characters are valid Unicode. I think that it would be fair to allow
>    such cases.

This is not the current practice:

In /usr/share/dict/ and /usr/lib/ispell/, only bokmål is 8-bit. Most
dictionary names are in English, sometimes with an alias in the language
(catala, dansk, foeroyskt, bokmål, svenska).

In /usr/lib/aspell/, most dictionaries are named using the ISO 639
two-letter code or the English name. There are some non-English aliases
like francais.alias, which is missing the cedilla. Only català, español
and íslenska are not plain ASCII.

So currently, there is no standard practice of naming dictionaries after
the UTF-8 encoding of the native spelling of the language, and it would be
more practical to standardize on ISO 639 language codes instead.

>  * There are names that look rather arbitrary and replaceable
>    with ASCII alternatives if needed. For instance in python-pyramid,
>    usr/lib/python2.6/dist-packages/pyramid/tests/fixtures/static/héhé.html

These are probably test files that could be removed from the binary
packages.

>  * There are CA certificates with names like Certinomis_-_Autorité_Racine.crt.
>    Since I do not know how these certificates work, I do not know if they
>    can be renamed.

The main reason they have such names is to avoid name clashes with other
.crt files.

>  * There is a file that needs to be in non-ASCII Unicode to fit its purpose:
>    usr/share/doc/console-tools/examples/♪♬ in console-tools. The package
>    also distributes a file called README.strange-name in the same directory.

The value of such a file is pretty low.

>  * There are some more dubious names like 6Sze¶æ_Jab³ek.png in lletters-media,
>    or Miroir_Sphérique in optgeo. However, they do not cause much
>    inconvenience with a Unicode locale.

Miroir_Sphe♦rique is a bug in itself: it should be Miroir_Sphérique.
'6Sze¶æ_Jab³ek.png' is probably misencoded (it is intended to be 6 in
Polish, i.e. sześć).

Cheers,
-- 
Bill. <ballo...@debian.org>

Imagine a large red swirl here.

-- 
To UNSUBSCRIBE, email to debian-policy-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130306231213.GD32005@yellowpig
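
To illustrate the "probably misencoded" remark about 6Sze¶æ_Jab³ek.png:
the garbled characters are consistent with ISO-8859-2 bytes having been
read as ISO-8859-1. The sketch below only demonstrates that assumption; it
does not show how the package was actually built. The function name
repair_mojibake is hypothetical and not part of any Debian tool.

```python
def repair_mojibake(name: str,
                    read_as: str = "iso-8859-1",
                    intended: str = "iso-8859-2") -> str:
    """Undo a wrong-charset decoding of a filename.

    Assumption (not confirmed by the thread): the bytes were written in
    `intended` but later displayed as if they were `read_as`.
    """
    # Recover the original byte sequence from the wrongly decoded text.
    raw = name.encode(read_as)
    # Decode those bytes with the charset the author presumably used.
    return raw.decode(intended)


if __name__ == "__main__":
    # Under the assumption above, this prints 6Sześć_Jabłek.png
    print(repair_mojibake("6Sze¶æ_Jab³ek.png"))
```

Pure ASCII names pass through unchanged, since both charsets agree on the
ASCII range, so such a check would only affect the handful of 8-bit names
discussed here.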