On 21/02/13 11:43, Helmut Grohne wrote:
> The number of exceptions is about 200 contained in about 50 binary
> packages.

Do you have a list handy?

What proportion of them are UTF-8? You can test via, for instance:

  printf '%s\n' "$filename" | isutf8 -q /dev/stdin || echo "not UTF-8: $filename"

with isutf8(1) from moreutils. In theory this could have false
positives (a non-UTF-8 name that happens to pass the check), but
UTF-8's design makes it unlikely that meaningful strings in ISO-8859-*
happen to be syntactically valid UTF-8.
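
If moreutils isn't installed, the same check can be sketched with
iconv(1), which fails when asked to decode invalid UTF-8. The function
names below are made up for illustration, and the loop assumes no
filename contains a newline:

```shell
# Hypothetical helper: succeeds if "$1" is syntactically valid UTF-8.
# Decoding from UTF-8 to UTF-8 makes iconv exit non-zero on invalid input.
is_utf8_name() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical helper: print every path under directory "$1" that is not
# valid UTF-8. Assumption: no filename contains a newline character.
list_non_utf8() {
    find "$1" -print | while IFS= read -r path; do
        is_utf8_name "$path" || printf 'not UTF-8: %s\n' "$path"
    done
}
```

Running list_non_utf8 on an unpacked package tree would then give a
rough list of the exceptions.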

> In those packages some filenames are not representable as
> UTF-8 (for example aspell-is)

I assume you mean "are not UTF-8" (presumably they're ISO-8859-1 or
ISO-8859-15?) rather than "not representable"? Any Latin-1 string is
representable in UTF-8 via transcoding, although the resulting bytes
will obviously be different.
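
As a small illustration of that transcoding (a sketch only; the
function name and the sample string are my own, not taken from any
package), iconv(1) turns the single Latin-1 byte 0xE9 into the
two-byte UTF-8 sequence 0xC3 0xA9:

```shell
# Hypothetical helper: transcode "$1" from ISO-8859-1 to UTF-8 and print
# the resulting bytes as one hex string.
latin1_to_utf8_hex() {
    printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

# "caf" followed by the Latin-1 byte 0xE9 (e with acute accent)
latin1_to_utf8_hex "$(printf 'caf\351')"   # prints 636166c3a9
```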

> and others don't make any sense in
> ISO-8859-15 (for example ca-certificates).

These do appear to be UTF-8.

> to mandating a particular encoding (such as UTF-8).

I would personally be inclined to recommend/mandate UTF-8.

I certainly don't think any option other than ASCII, UTF-8 or "they're
just bytestrings, deal with it" would make sense - the third of those
options is what we have at the moment, and this bug is basically a
request to reject it.

Tools typically assume either that filenames are encoded according to
the current locale (traditional Unix behaviour, and GNOME with
G_BROKEN_FILENAMES set) or that they are UTF-8 (probably many tools,
but notably encouraged by GNOME); and I believe Debian has defaulted to
UTF-8 locales for quite some time, so the two often coincide.

Also, as far as I know, UTF-8 is the only widely-used encoding that can
represent all Unicode characters and is suitable for Unix filenames.
ISO-8859-* can't represent all characters; UTF-16 and UTF-32 are
unsuitable for Unix filenames because they don't coincide with ASCII
over the ASCII range; and UCS-2 manages to have both problems
simultaneously.
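
To make the ASCII point concrete, here is a rough sketch (the function
name is hypothetical) that dumps the bytes of an ASCII-only string in
different encodings. In UTF-16 each ASCII character gains a 0x00 byte,
and NUL is one of the two bytes that cannot appear in a Unix filename
component (the other being "/"):

```shell
# Hypothetical helper: encode the UTF-8 string "$2" as encoding "$1" and
# print the resulting bytes as one hex string.
encode_hex() {
    printf '%s' "$2" | iconv -f UTF-8 -t "$1" | od -An -tx1 | tr -d ' \n'
}

encode_hex UTF-8 'a/b'      # prints 612f62 (identical to ASCII)
echo
encode_hex UTF-16LE 'a/b'   # prints 61002f006200 (NUL after every byte)
echo
```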

    S

