On Thu, 2011-03-31 at 15:58 +0200, Bill Allombert wrote:
> Dear Developers,
> 
> there are a small number of packages that ship files with non-7bit 
> characters in filenames.
> $ apt-file search -l -x '[\x80-\xff]'
> 
> aspell-ca
> aspell-es
> aspell-is
> canorus
> console-tools
> dvb-apps
> ggz-python-games
> inorwegian
> jpilot
> lletters-media
> otrs2
> wnorwegian
> 
> So this raises two issues:
> 1) should non-7bit characters in filenames be allowed?
> 2) if yes, should we require the filenames to be in a correct UTF-8 encoding?
> 
> I raise the question because I was trying to filter out popcon reports that
> include non-7bit characters since it usually implies corruption of data,
> but this might not be the case.
> 
> Also, it seems there is a tool out there that generates .deb packages with
> names like designkit.702840f10216893fc3494b731e825b33666733d6.1
> and filenames that are all non-7bit (probably in Japanese).

I think we should definitely *not* forbid this, and we should (at the
very least) be working towards supporting the practice.

It may be that we can't properly support this until we can guarantee a
C.UTF-8 locale as a minimum available standard, but that sounds to me
like another justification for such a locale.

I think we should encourage the filename to be in a UTF-8 encoding, and
even if upstream does use 8-bit filenames with a non-UTF-8 encoding I
think that a Debian packager should be encouraged to patch that.
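
To illustrate what such a patch amounts to, here is a minimal sketch of
re-encoding a legacy 8-bit filename as UTF-8. The function name
to_utf8_name and the ISO-8859-1 default are my own assumptions for
illustration; the legacy encoding cannot be detected reliably, so the
packager has to know which one upstream actually used.

```python
def to_utf8_name(raw: bytes, legacy_encoding: str = "iso-8859-1") -> bytes:
    """Return the filename re-encoded as UTF-8.

    Hypothetical helper: legacy_encoding is an assumption supplied by
    the caller, not something this code can work out by itself.
    """
    try:
        # Names that are already valid UTF-8 (ASCII included) pass through.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Otherwise interpret the bytes in the assumed legacy encoding
        # and re-encode the resulting text as UTF-8.
        return raw.decode(legacy_encoding).encode("utf-8")
```

A name that is already valid UTF-8 is returned unchanged, so running
this twice over the same tree should be harmless.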

I would also be OK with mandating that filenames should only be in
either UTF-8 or the ASCII subset thereof, and that ISO-8859-* and other
such restricted encodings are not welcome on our filesystems.
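
A check for such a rule could be quite small. The following is only a
sketch, assuming the filename is available as raw bytes; the name
classify_filename and its three return values are hypothetical and not
part of any existing tool.

```python
def classify_filename(raw: bytes) -> str:
    """Classify a raw filename as 'ascii', 'utf-8', or 'other'."""
    # Pure 7-bit names are the ASCII subset of UTF-8.
    if all(byte < 0x80 for byte in raw):
        return "ascii"
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Some other 8-bit encoding (e.g. ISO-8859-*) or corrupted data.
        return "other"
    return "utf-8"
```

Anything reported as 'other' would be a candidate for renaming, or in
the popcon case a candidate for being treated as corrupted.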

Regards,
                                        Andrew McMillan.
-- 
------------------------------------------------------------------------
andrew (AT) morphoss (DOT) com                            +64(272)DEBIAN
         If wishes were horses, then beggars would be thieves.
------------------------------------------------------------------------
