Christian Perrier <[EMAIL PROTECTED]>:

> > debian/po/nb.po is a mixture of UTF-8 and ISO-8859-1 encodings,
> > and as a result accented letters are wrongly displayed.
> > Here is a patch.
>
> How did you check that ?
>
> My usual check "msgfmt -o /dev/null -c --statistics <file>" did not
> show anything...
>
> I would be very interested in adding such check to the various PO file
> I handle here and there.
Probably somebody already has something better, but here's something
that might work. Run it like this:

    LC_ALL=C ./check_po_enc iso_3166-0.41.eo.po

No doubt some appropriate one-line addition to the code would make it
work in any locale, but I am in a state of perpetual confusion about
how Perl handles encodings.

#!/usr/bin/perl

unless ($#ARGV == 0) {
    print STDERR <<END
Usage: check_po_enc PO_FILE

Detects some obvious encoding errors, such as a mixture
of iso-8859-X and UTF-8.

YOU MUST RUN THIS IN A C LOCALE!
END
    ;
    exit 1;
}

# Slurp the whole file.
open(D, "<$ARGV[0]") || die;
$t = join("", <D>);
close(D);

# Discover the claimed encoding.
$t =~ /charset=([a-zA-Z0-9-]+)/ || die;
$enc = $1;

# Check that we can convert from the claimed encoding.
# close() on a pipe returns false if iconv exited with an error.
open(P, "|iconv -f $enc -t utf-8 > /dev/null") || die;
# "or", not "||": "print P $t || die" parses as "print P ($t || die)".
print P $t or die;
close(P) || die;

# In the case of iso-8859-X, look for dodgy high control chars.
# Bytes 0x80-0x9f are C1 controls there, but common inside UTF-8 sequences.
if ($enc =~ /^iso-8859-/i) {
    die if $t =~ /[\x80-\x9f]/;
}
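
For what it is worth, here is a rough sketch of how the same two checks might be done without depending on the locale. It is not the script above: it swaps the iconv pipe for Perl's core Encode module and reads the file as raw bytes, on the assumption that this is what removes the need for LC_ALL=C. The function name check_po_bytes is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: takes the raw bytes of a PO file and returns
# a list of problem descriptions (an empty list means nothing was found).
sub check_po_bytes {
    my ($bytes) = @_;

    # Discover the claimed encoding, as in the original script.
    my ($enc) = $bytes =~ /charset=([a-zA-Z0-9-]+)/
        or return ("no charset declaration found");

    my $obj = Encode::find_encoding($enc)
        or return ("unknown charset '$enc'");

    my @problems;

    # Strict decode; FB_CROAK makes malformed input die.
    # Decode a copy, because decode() may modify its argument
    # when a CHECK value is given.
    my $copy = $bytes;
    eval { $obj->decode($copy, Encode::FB_CROAK); 1 }
        or push @problems, "content is not valid $enc";

    # In iso-8859-X, bytes 0x80-0x9f are C1 control characters.
    push @problems, "C1 control bytes in $enc text"
        if $enc =~ /^iso-8859-/i && $bytes =~ /[\x80-\x9f]/;

    return @problems;
}

if (@ARGV == 1) {
    # ':raw' gives plain bytes, with no locale-dependent decoding layer.
    open(my $fh, '<:raw', $ARGV[0]) or die "cannot open $ARGV[0]: $!\n";
    local $/;    # slurp mode
    my $bytes = <$fh>;
    close($fh);

    my @problems = check_po_bytes($bytes);
    print STDERR "$ARGV[0]: $_\n" for @problems;
    exit(@problems ? 1 : 0);
}
```

As with the original, this only catches the obvious cases. A file that claims iso-8859-1 but contains UTF-8 whose continuation bytes all fall outside 0x80-0x9f (for example 0xC3 0xA5) still passes.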