On Sunday 13 January 2008, Christian Perrier wrote:
> Quoting Frans Pop ([EMAIL PROTECTED]):
> > For now I'm mostly interested in discussion of the issues I mention in
> > the comments in the big new section (which basically replaces the old
> > section that follows it).
>
> > + # We need the date of the last update of a sublevel PO file
>
> Yes.
>
> > + # Preferably we should also determine the name of the person who
> > + # did the last update to a sublevel (for changelogs)
>
> That would certainly be better. I fear it could complicate the code
> quite a lot and, indeed, the translation is mostly a team work. I
> personally don't give much importance to Last-Translator.
>
> So, well, if we find a *not too complicated* way to allow for
> different last-translator, why not. But I don't think it's worth a
> great effort.

With some generalized functions it wasn't "too complicated" :-)
And I do think it's important to at least /try/ and get the correct
translator into our changelogs.

Are there gettext alternatives to the po_print_header() and
po_print_body() functions? If there are, I think that would be preferred,
but note that my functions select/remove both leading comments and all
headers.

> > + # When updating a sublevel PO file, we should really retain
> > + # all the old headers and only update the POT-Creation-Date...
>
> Yes, definitely. There may be specific comments, or whatever

Done.

> > + # Do we really want to lose obsolete strings?
> > + # Shouldn't that be up to the translator?
> > + msgattrib --width=79 --no-obsolete
> > sublevel${i}/${lang}.po.new
> > >sublevel${i}/${lang}.po
>
> I should have put a comment when I added this. I know there was a
> reason..:-|

Some fancy footwork was needed, but it looks like I've got a working
implementation for this. All obsolete strings are now gathered in the
sublevel1 PO file.

Attached a new version of the patch. I think this solves all issues I
spotted with the original implementation.

The current patch still also supports the current system. For the final
version I would suggest removing that (in practice: remove the "old"
Phase III code). I'd also suggest removing the --split option and instead
just hardcoding the number of levels in a variable in the script.

I've done a fair amount of testing and the results look good to me.
I would suggest delaying implementation of the patch until after the Beta1
release, but it would be great if you could test this a bit too.

The way I have tested this is:
$ cd <d-i dir>
# Make sure there are no pending changes!
$ for i in 1 2 3 4 5; do mkdir packages/po/sublevel$i; done

# Prepare for conversion (repeat the following 3 commands to revert
# to the initial state):
$ svn revert -R packages/
$ cp packages/po/*.po packages/po/sublevel1/
$ rm -f packages/po/sublevel[2345]/*

# Initial conversion run:
$ <path>/l10n-sync --noupdatepo --force --split=5 --convert `pwd`

# Do translation updates etc, then do a "normal" run:
$ <path>/l10n-sync --noupdatepo --force --split=5 `pwd`

# Clean up after testing:
$ svn revert -R packages/
$ rm -f packages/po/sublevel*/*

Cheers,
FJP

commit 8b85f434878442c8307b8c5bb14baf7105b5393e
Author: Frans Pop <[EMAIL PROTECTED]>
Date:   Sat Jan 12 00:53:59 2008 +0100

    Improve multi-level handling

    Main change is an improved method for updating from sublevel PO files.
    Characteristics of the new method:
    - preserve the headers in sublevel PO files (only POT-Creation-Date is
      updated)
    - the PO-Revision-Date and Last-Translator for the most recently updated
      sublevel PO file are used to update PO files for individual packages
    - if the level of a string changes, the existing translation from the old
      level is preserved
    - obsolete strings are preserved in the sublevel1 PO file in case strings
      are reintroduced later (translators should remove them occasionally)

    A temporary '--convert' option has been added to facilitate conversion
    from the current master PO files to multi-level PO files.

    Other changes
    - Introduce new functions to determine header values for most recent
      updated PO or POT file.
    - Remove custom PO file headers (X-*) from merged PO files before updating
      translations in packages directories to avoid cluttering them.

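The header preservation described in the commit message can be tried out in isolation. What follows is only a minimal sketch: the two awk programs are the ones used by po_print_header() and po_print_body() in the patch, while combine_demo and its two sample files are made up for illustration. The patch additionally pipes the combined result through msgattrib, which is left out here so the sketch runs without gettext installed.

```shell
#!/bin/sh
# Sketch only: keep the header of an existing PO file while taking the
# strings from a freshly merged one. combine_demo and the sample files are
# hypothetical; the awk programs are taken from the patch.

# Print everything up to the blank line that ends the header entry
po_print_header() {
    awk 'BEGIN {found = 0}
         /^msgid ""/ {found = 1}
         /^$/ {if (found == 1) exit}
         {print $0}' "$1"
}

# Print everything from the blank line that ends the header entry onwards
po_print_body() {
    awk 'BEGIN {found = 0}
         /^msgid ""/ {if (found == 0) found = 1}
         /^$/ {if (found == 1) found = 2}
         {if (found == 2) print $0}' "$1"
}

combine_demo() {
    dir=$(mktemp -d) || return 1

    # Existing sublevel file: leading comment and header we want to keep
    printf '%s\n' '# Dutch translation' 'msgid ""' 'msgstr ""' \
        '"Last-Translator: Old Translator\n"' '' \
        'msgid "Hello"' 'msgstr "Hallo"' >"$dir/old.po"

    # Freshly merged file: generic header, up-to-date strings
    printf '%s\n' 'msgid ""' 'msgstr ""' \
        '"Last-Translator: Automatic\n"' '' \
        'msgid "Goodbye"' 'msgstr "Tot ziens"' >"$dir/new.po"

    # Old header followed by new body
    ( po_print_header "$dir/old.po"
      po_print_body "$dir/new.po" )

    rm -rf "$dir"
}
```

Calling combine_demo prints the leading comment and header of old.po followed by the "Goodbye" entry of new.po; neither the "Automatic" header nor the "Hello" entry survives.
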
diff --git a/scripts/l10n/l10n-sync b/scripts/l10n/l10n-sync
index 9601588..0e573f0 100755
--- a/scripts/l10n/l10n-sync
+++ b/scripts/l10n/l10n-sync
@@ -18,6 +18,7 @@ NUMLEVELS=1
 UPDATEPO=Y
 SYNCPKGS=Y
 QUIET=N
+CONVERT=N
 
 svn=svn
 debconfupdatepo=debconf-updatepo
@@ -145,6 +146,73 @@ criticalerr() {
 	exit 3
 }
 
+
+po_last_updated() {
+	local key files file lastfile lastdate tdate
+	key=$1
+	shift
+	files="$*"
+
+	lastdate=0
+	for file in $files ; do
+		tdate=$(date -d "$(grep "^\"$key:" $file | \
+			sed 's/^.*: \(.*\)\\n.*$/\1/')" "+%s")
+		if [ $tdate -gt $lastdate ] ; then
+			lastdate=$tdate
+			lastfile=$file
+		fi
+	done
+	echo "$lastfile"
+}
+
+# Get the whole line
+# The --no-wrap is needed because translator can span more than one line
+# The last sed statement is needed to preserve the \n at the end
+po_get_header() {
+	local key=$1
+	local file=$2
+	msgattrib --no-wrap $file | grep "^\"$key:" | sed 's/^.*: \(.*\)\\n.*$/\1/'
+}
+
+# Replace a header with a new value
+# The complex sed expression is to allow for the fact that a header may span
+# two lines; spanning three lines is not supported
+po_replace_header() {
+	local key=$1
+	local value=$2
+	local file=$3
+	sed -i "/^\"$key:/ N; s/^\"$key.*\\\\n\"\(\n.*\|$\)/\"$key: $value\\\\n\"\1/" \
+		$file
+}
+
+# Print anything up to the first msgid (the header)
+po_print_header() {
+	awk 'BEGIN {found = 0}
+	     /^msgid ""/ {found = 1}
+	     /^$/ {if (found == 1) exit}
+	     {print $0}' $1
+}
+
+# Print anything after the first msgid (the header)
+po_print_body() {
+	awk 'BEGIN {found = 0}
+	     /^msgid ""/ {if (found == 0) found = 1}
+	     /^$/ {if (found == 1) found = 2}
+	     {if (found == 2) print $0}' $1
+}
+
+# Print obsolete strings
+po_print_obsolete() {
+#	# Old "manual" version
+#	awk 'BEGIN {found = 0; lead=""}
+#	     /^#~ msgid/ {if (found == 0) {found = 1; print lead}}
+#	     {if (found == 0) lead=lead"\n"$0}
+#	     /^$/ {if (found == 0) lead=""}
+#	     {if (found == 1) print $0}' $1
+
+	msgattrib --only-obsolete --width=79 $1 | po_print_body
+}
+
 ## Command line parsing
 MORETODO=true
 while $MORETODO ; do
@@ -189,6 +257,9 @@ while $MORETODO ; do
 	"--nolog")
 		LOG=""
 		;;
+	"--convert")
+		CONVERT=Y
+		;;
 	"--"*)
 		echo "Illegal option: $1" >&2
 		usage
@@ -398,21 +469,19 @@ log "- Merge all package templates.pot files..."
 if ! msgcat ${pots} >/dev/null 2>&1 ; then
 	svnerr
 fi
-log_cmd --pass msgcat ${pots} | \
+log_cmd --pass msgcat $pots | \
 	sed 's/charset=CHARSET/charset=UTF-8/g' >$DI_COPY/packages/po/template.pot.new
 # Determine the most recent POT-Creation-Date for individual components
 # Include master templates.pot too so the timestamp will never be set back
-LASTDATE="$(
-	for j in ${pots} po/template.pot; do
-		date -ud "$(grep "POT-Creation-Date:" $j | sed 's/^.*: \(.*\)\\n.*$/\1/')" "+%F %R%z"
-	done | sort | tail -n 1)"
+LASTDATE="$(po_get_header "POT-Creation-Date" \
+	$(po_last_updated "POT-Creation-Date" $pots po/template.pot))"
+
 # We don't want all templates.pot files headers as we don't care about them
 # So we merge the generated file with a simple header.pot file
 if [ -f po/header.pot -a -s po/template.pot.new ] ; then
 	msgcat --use-first po/header.pot po/template.pot.new | \
-		sed 's/charset=UTF-8/charset=CHARSET/g' | \
-		sed "s/^.*POT-Creation-Date:.*$/\"POT-Creation-Date: $LASTDATE\\\n\"/" \
-		> po/template.pot
+		sed 's/charset=UTF-8/charset=CHARSET/g' > po/template.pot
+	po_replace_header "POT-Creation-Date" "$LASTDATE" po/template.pot
 	rm po/template.pot.new
 else
 	error "ERROR: no $DI_COPY/packages/po/header.pot file. Cannot continue."
@@ -465,14 +534,119 @@ if [ "$WITHLEVELS" = "Y" ] ; then
 fi
 log ""
 
+# Update PO files for sublevels:
+# 3a) Synchronize with D-I SVN
+# 3b) Merge the sublevel PO files into a master PO file
+# 3c) Update the master PO file from the master POT file as it will be used
+#     to update package PO files
+# 3d) Update the sublevel PO files from this master PO file and the sublevel POT file
+# 3e) commit back the changed file
+log "Phase III: update master translation files"
+if [ "$WITHLEVELS" = "Y" ] ; then
+	cd $DI_COPY/packages/po
+	languages=""
+	for po in sublevel1/*.po ; do
+		lang=$(basename $po .po)
+		# Next line is just for quicker testing
+		#[ $lang = nl ] || continue
+		log "- $lang"
+		if [ ! -r PROSPECTIVE ] || \
+		   ([ -r PROSPECTIVE ] && \
+		    ! grep -q "^$lang[[:space:]]*$" PROSPECTIVE); then
+			languages="${languages:+$languages }$lang"
+		fi
+
+		log " - Merge sublevel PO files into master PO file and update..."
+		list=""
+		for i in $LEVELS; do
+			if [ -f sublevel$i/$lang.po ]; then
+				list="${list:+$list }sublevel$i/$lang.po"
+			fi
+		done
+		# Retain the date and translator of the last updated sublevel PO file
+		LASTFILE="$(po_last_updated "PO-Revision-Date" $list)"
+		LASTDATE="$(po_get_header "PO-Revision-Date" $LASTFILE)"
+		LASTTRANS="$(po_get_header "Last-Translator" $LASTFILE)"
+		msgcat --use-first $list >${lang}.po
+		po_replace_header "PO-Revision-Date" "$LASTDATE" $lang.po
+		po_replace_header "Last-Translator" "$LASTTRANS" $lang.po
+
+		# Update the master PO file (as it's used to update package PO files)
+		log_cmd --pass msgmerge --previous $lang.po template.pot >$lang.po.new || \
+			gettexterr
+
+		# Remember obsolete strings
+		OBSOLETE="$(po_print_obsolete $lang.po.new)"
+
+		# Optionally merge with PO files from a different source
+		# Strings from the other source are preferred!
+		# Should we disallow automatic commits for this?
+		# WARNING: NOT TESTED!!!
+		if [ -n "$MERGEDIR" ] && [ -r $MERGEDIR/$lang.po ]; then
+			log " - Merge with $MERGEDIR/$lang.po !!"
+			msgcat --use-first "$MERGEDIR/$lang.po" $lang.po.new \
+				>$lang.po.merge || gettexterr
+			log_cmd --pass msgmerge --previous $lang.po.merge template.pot | \
+				msgattrib --no-obsolete >$lang.po.new || gettexterr
+			rm $lang.po.merge
+		fi
+
+		# Clean up new master PO file
+		msgattrib --width=79 --no-obsolete $lang.po.new >$lang.po
+		rm $lang.po.new
+
+		# Update the sublevel PO files
+		# We keep its old header and only update the POT-Creation-Date
+		for i in $LEVELS; do
+			if [ -f sublevel$i/$lang.po ]; then
+				OLDHEADER="$(po_print_header sublevel$i/$lang.po)"
+			elif [ "$CONVERT" = Y ]; then
+				OLDHEADER="$(po_print_header $lang.po)"
+			fi
+			if [ -f sublevel$i/$lang.po ] || [ "$CONVERT" = Y ]; then
+				log_cmd --pass -m " - Merge with template.pot for sublevel $i..." \
+					msgmerge --previous $lang.po \
+					sublevel$i/template.pot \
+					>sublevel$i/$lang.po.new || gettexterr
+				POTDATE="$(po_get_header "POT-Creation-Date" sublevel$i/$lang.po.new)"
+
+				# Combine old header and new content
+				( echo "$OLDHEADER"
+				  po_print_body sublevel$i/$lang.po.new ) | \
+					msgattrib --width=79 --no-obsolete \
+					>sublevel$i/$lang.po
+				po_replace_header "POT-Creation-Date" "$POTDATE" sublevel$i/$lang.po
+				# Append any obsolete strings to sublevel1 PO file
+				if [ $i -eq 1 ] && [ "$OBSOLETE" ]; then
+					echo "$OBSOLETE" >>sublevel$i/$lang.po
+				fi
+				rm sublevel$i/$lang.po.new
+			fi
+		done
+
+		# Remove all custom headers so they don't clutter the PO files in
+		# the packages directories
+		msgattrib --no-wrap $lang.po | \
+			grep -v "^\"X-.*: .*\\n\"$" | \
+			msgattrib --width=79 >$lang.po.new
+		mv $lang.po.new $lang.po
+	done
+
+	if [ "$COMMIT" = "Y" ] ; then
+		log_cmd -p "Commit all general PO/POT files to SVN..." \
+			$svn commit -m "$COMMIT_MARKER Updated packages/po/* against package templates" || svnerr
+	fi
+fi
+
 # For each PO file in packages/po/sublevel* or packages/po:
 # 3a) Synchronize with D-I SVN
 # 3b) Update with template.pot
 # 3c) Grab translations from the lower levels file(s)
 # 3d) commit back the changed file
-log "Phase III: update master translation files"
 for i in $LEVELS; do
 	if [ "$WITHLEVELS" = "Y" ] ; then
+		# Bail out; work has already been done in previous section
+		break
 		dir=po/sublevel$i
 		level="level $i "
 	else
@@ -540,26 +714,9 @@ for i in $LEVELS; do
 			$svn commit -m"${COMMIT_MARKER} Updated packages/$dir/* with general template.pot" *.po template.pot || svnerr
 	fi
 done
-
-# If we use levels, create a temporary general file
-# (which we won't commit) to make merging in individual packages
-# much faster
-if [ "$WITHLEVELS" = "Y" ] ; then
-	cd $DI_COPY/packages/po
-	for po in sublevel1/*.po ; do
-		lang=$(basename $po .po)
-		list=""
-		for i in `seq $NUMLEVELS -1 1`; do
-			if [ -f sublevel${i}/${lang}.po ]; then
-				list="$list sublevel${i}/${lang}.po"
-			fi
-		done
-		msgcat --use-first $list >${lang}.po
-	done
-fi
 log ""
 
 # Loop over D-I packages:
@@ -578,10 +735,10 @@ if [ "$SYNCPKGS" = "Y" ]; then
 		for lang in $languages ; do
 			logn "$lang "
 			cat >$lang.po.new <<EOF
-# THIS FILE IS AUTOMATICALLY GENERATED FROM THE MASTER FILE:
-# packages/po/$lang.po
+# THIS FILE IS GENERATED AUTOMATICALLY FROM THE D-I PO MASTER FILES
+# The master files can be found under packages/po/
 #
-# DO NOT MODIFY IT DIRECTLY: SUCH CHANGES WILL BE LOST
+# DO NOT MODIFY THIS FILE DIRECTLY: SUCH CHANGES WILL BE LOST
 #
 EOF
 			log_cmd --pass msgmerge $DI_COPY/packages/po/$lang.po templates.pot | \
@@ -599,22 +756,21 @@ EOF
 				egrep -v "$filter" $lang.po >$oldfiltered
 				egrep -v "$filter" $lang.po.new >$newfiltered
 				if [ -z "$(diff $oldfiltered $newfiltered)" ] ; then
-					# Don't commit if the only chages are in filtered lines
+					# Don't commit if the only changes are in filtered lines
 					rm $lang.po.new
 				else
+					# Remember original PO-Revision-Date
+					LASTDATE="$(po_get_header "PO-Revision-Date" $lang.po)"
+					mv $lang.po.new $lang.po
 					# At least one unfiltered line changed
 					# Put the old Revision-Date back if asked for
-					if [ "$KEEP_REVISION" != "N" ] && [ "$KEEP_REVISION" = "$lang" ] ; then
-						# Grab back the PO-Revision-Date from the old file
-						old_revision=`grep -e "^\"PO-Revision-Date:" $lang.po | sed 's/\\\\n\"//g'`
-						# And replace the one from the new file with it
-						# then put all this as a result
-						sed "s/\"PO-Revision-Date:.*/$old_revision\\\\n\"/g" $lang.po.new >$lang.po
-						rm $lang.po.new
-						log_s1 "${package}/debian/po/${lang}.po" "CHANGED, revision kept"
+					if [ "$KEEP_REVISION" != "N" ] && \
+					   [ "$KEEP_REVISION" = "$lang" ] ; then
+						# Restore original PO-Revision-Date
+						po_replace_header "PO-Revision-Date" "$LASTDATE" $lang.po
+						log_s1 "$package/debian/po/$lang.po" "CHANGED, revision kept"
 					else
-						mv $lang.po.new $lang.po
-						log_s1 "${package}/debian/po/${lang}.po" "CHANGED"
+						log_s1 "$package/debian/po/$lang.po" "CHANGED"
 					fi
 				fi
 				# Remove temporary files
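
The last hunk changes the order of operations: remember the old PO-Revision-Date, move the new file into place, then put the old date back. A rough way to try that flow on its own is sketched below. get_header_simple and replace_header_simple are hypothetical, simplified stand-ins for po_get_header() and po_replace_header(): they assume every header fits on one line (so msgattrib --no-wrap and the two-line sed handling are skipped) and that the value contains no "/" or "&". The file names and dates are made up.

```shell
#!/bin/sh
# Sketch only: hypothetical single-line variants of the patch's header
# helpers, plus a demo of the "remember, move, restore" sequence.

# Print the value of header $1 from PO file $2 (same sed expression as
# po_get_header() in the patch, minus the msgattrib --no-wrap step)
get_header_simple() {
    grep "^\"$1:" "$2" | sed 's/^.*: \(.*\)\\n.*$/\1/'
}

# Replace the value of header $1 with $2 in PO file $3.
# Assumption: the header is on one line and $2 has no "/" or "&".
replace_header_simple() {
    sed "s/^\"$1: .*\\\\n\"\$/\"$1: $2\\\\n\"/" "$3" >"$3.tmp" &&
        mv "$3.tmp" "$3"
}

keep_revision_demo() {
    dir=$(mktemp -d) || return 1

    # Old package PO file and its regenerated replacement
    printf '%s\n' 'msgid ""' 'msgstr ""' \
        '"PO-Revision-Date: 2007-12-01 10:00+0100\n"' >"$dir/nl.po"
    printf '%s\n' 'msgid ""' 'msgstr ""' \
        '"PO-Revision-Date: 2008-01-12 00:53+0100\n"' >"$dir/nl.po.new"

    # Remember the original date, install the new file, restore the date
    lastdate=$(get_header_simple "PO-Revision-Date" "$dir/nl.po")
    mv "$dir/nl.po.new" "$dir/nl.po"
    replace_header_simple "PO-Revision-Date" "$lastdate" "$dir/nl.po"

    get_header_simple "PO-Revision-Date" "$dir/nl.po"
    rm -rf "$dir"
}
```

Running keep_revision_demo prints the date from the old file, which shows that the regenerated file ends up carrying the original revision date.
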