On Thu, Oct 21, 2004 at 09:22:10AM +0100, Edmund GRIMLEY EVANS wrote:
> Christian Perrier <[EMAIL PROTECTED]>:
>
> > > debian/po/nb.po is a mixture of UTF-8 and ISO-8859-1 encodings,
> > > and as a result accented letters are wrongly displayed.
> > > Here is a patch.
> >
> > How did you check that?
> >
> > My usual check "msgfmt -o /dev/null -c --statistics <file>" did not
> > show anything...
> >
> > I would be very interested in adding such check to the various PO file
> > I handle here and there.
>
> Probably somebody already has something better, but here's something
> that might work. Run it like this:
[...]
> # Check that we can convert from the claimed encoding.
> open(P, "|iconv -f $enc -t utf-8 > /dev/null");
> print P $t || die;
> close(P) || die;
>
> # In the case of iso-8859-X, look for dodgy high control chars.
> if ($enc =~ /^iso-8859-/i) {
>     die if $t =~ /[\x80-\x9f]/;
> }
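
For readers who want to try the idea in the quoted fragment, here is a
minimal Python sketch of the same two checks: decode the file using the
charset it claims, and, for ISO-8859-X files, look for bytes in the
0x80-0x9f range. It is not the original script (most of which is elided
above). The function names and the charset regex are assumptions made
for illustration, and decoding is done with Python's own codecs rather
than iconv.

```python
import re

# Assumption: the charset is declared as "charset=NAME" in the PO header,
# as in 'Content-Type: text/plain; charset=ISO-8859-1\n'.
CHARSET_RE = re.compile(rb"charset=([A-Za-z0-9_.:-]+)")


def claimed_charset(data: bytes) -> str:
    """Return the charset name declared in the PO file (hypothetical helper)."""
    match = CHARSET_RE.search(data)
    if match is None:
        raise ValueError("no charset declaration found")
    return match.group(1).decode("ascii")


def check_po_encoding(data: bytes) -> list:
    """Return a list of problem descriptions; an empty list means none found."""
    problems = []
    enc = claimed_charset(data)

    # Check that the content can be decoded from the claimed encoding.
    try:
        data.decode(enc)
    except (UnicodeDecodeError, LookupError) as exc:
        problems.append(f"cannot decode as {enc}: {exc}")

    # In the case of iso-8859-X, bytes 0x80-0x9f are control characters
    # and should not appear in translated text. Report the line number
    # so the offending string can be located.
    if enc.lower().startswith("iso-8859-"):
        for lineno, line in enumerate(data.split(b"\n"), start=1):
            if re.search(rb"[\x80-\x9f]", line):
                problems.append(f"line {lineno}: byte in range 0x80-0x9f")

    return problems
```

Like the quoted heuristic, this only catches part of the problem: a
stray UTF-8 "Å" (bytes 0xC3 0x85) in an ISO-8859-1 file is flagged, but
a stray UTF-8 "å" (bytes 0xC3 0xA5) is not, because 0xA5 is a valid
printable character in ISO-8859-1.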

My first try looked something like that, but I wanted to submit
patches, so I needed to know which strings are wrong in order to fix
the PO files. msgexec is used for that purpose. Anyone interested can
have a look at
  http://people.debian.org/~barbier/check-po/checkfiles
  http://people.debian.org/~barbier/check-po/checkstring
and the other files found in that directory (log.UTF-8.txt is a
detailed report; bylang.txt and bypkg.txt are summaries sorted by
language and by package).

I do not know what to do with the errors listed at
  http://people.debian.org/~barbier/check-po/log.UTF-8.txt
It would be really great if some translators could take care of their
language, especially when the charset is different from ISO-8859-1 and
ISO-8859-15. On the other hand, some coordination would be nice so
that a package gets bugged only once, with a patch fixing all
languages. The best option is certainly to send a message here if you
are willing to fix bugs.

Please note that there are many false positives, e.g. because
translators use UTF-8 characters which have no equivalent in their
legacy encoding, or because the reported message is never displayed
(it is either fuzzy or obsolete), so these errors have to be handled
carefully. Generating this report takes about an hour, so this script
will not be run periodically.

As usual, check the BTS before working on a package to see whether a
bug has already been reported. I filed bugs yesterday against the
bsdmainutils, console-data, fonty, foomatic-filters and shadow
packages. Special thanks to Graham Wilson, who fixed bsdmainutils
within a couple of hours; I was really impressed.

Denis