Hi Wolfgang,

Wolfgang Schweer schreef op ma 20-01-2025 om 20:25 [+0000]:
> Hi Frans,
> after taking a look at hosted weblate, I noticed that
> the Tamil translations are totally broken. Instead of using Weblate
> step by step, the translator seems to have used a programmatical tool
> to "translate" both PO files, resulting in lots of XML syntax errors.
> Besides tags, also proper names have been translated to Tamil. It's a
> complete mess.

Thanks for pointing out this problem. I emailed the last translator
asking them to check the strings with failing checks.

I think this is a good opportunity to look into the quality control
issue for hosted.weblate translations. That's why I decided to put 
debian-edu@lists.debian.org in Cc. I hope you don't mind.

Currently, debian-edu-doc does not have any explicit quality control in
place at hosted.weblate. Since anyone who wants to contribute to a
translation can freely do so, one could argue that this provides some
implicit quality checking: existing translations can be reviewed and
improved by peer translators.
One can argue in favor of the current situation by stating: better an
imperfect translation than no translation at all. This is more or less
in line with the current loose policy on a minimum translation
threshold: better a limited translation than no translation at all.

But perhaps this is not the best possible policy. The problem with the
Tamil translation invites us to think about it anyway.

Hosted.weblate offers the possibility to enforce peer review project
wide or per language. With such a workflow, anybody can add a
suggestion, which needs approval from additional members of the
translation team before it is accepted as a translation. Suggestions
become translations when given a predetermined number of votes.
Personally, I am not a big fan of such a method, because I know from
experience that it can be a blocker for small translation teams. The
Debian DDTP project uses such a method. For package description
translations into Dutch, this has led to a situation where a number of
draft translations remain blocked for lack of reviews. As a side
effect, translators no longer want to invest time and energy in a
translation that may remain blocked forever.

Hosted.weblate automatically performs a number of checks, similar to
those built into some desktop translation tools. It groups the strings
with failed checks together under "Translated strings with any failing
checks", but also breaks them down by each individual failed check:
"Unchanged translation, Mismatched full stop, Mismatched colon,
Mismatched semicolon, XML syntax, XML markup, and Consecutive
duplicated words."
Not all of these issues are equally important from a translation
quality perspective, but failed XML checks certainly are. Perhaps we
should therefore set a maximum number of failed XML checks above which
a translation is no longer accepted into the main branch on salsa.
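
As a rough illustration of such a threshold, here is a minimal Python
sketch. It is not how Weblate or our build scripts work; the function
names, the max_failures parameter and the sample strings are all made
up for this example. It only checks whether each translated string is
well-formed XML when wrapped in a dummy root element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a dummy root."""
    try:
        ET.fromstring(f"<root>{fragment}</root>")
    except ET.ParseError:
        return False
    return True


def failing_translations(pairs):
    """Return the msgids whose non-empty msgstr is not well-formed XML."""
    return [msgid for msgid, msgstr in pairs
            if msgstr and not is_well_formed(msgstr)]


def acceptable(pairs, max_failures=0):
    """Hypothetical policy: tolerate at most max_failures broken strings."""
    return len(failing_translations(pairs)) <= max_failures


# Made-up sample data, not taken from the real PO files.
sample = [
    ("The <emphasis>main</emphasis> server",
     "De <emphasis>hoofd</emphasis>server"),
    ('See <ulink url="https://example.org"/>',
     'Zie <ulink url="https://example.org"/'),  # closing ">" is missing
    ("Not yet translated", ""),                 # untranslated, so skipped
]
print(failing_translations(sample))
print(acceptable(sample, max_failures=0))
```

Note that this sketch would also flag strings using named entities it
does not know about, so a real check would need to be more careful.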

One could also try to reduce the risk of such errors as much as
possible by rewriting certain paragraphs of the manual, so that risky
text can be filtered out more thoroughly by the get_manual script.
One could rewrite the paragraph
"The source at <ulink url="https://wiki.debian.org/DebianEdu/Documentation/Bookworm"/> is a wiki and updated frequently."
in the following way:
"The source of this manual is a wiki and is updated frequently. It can
be found at:
<ulink url="https://wiki.debian.org/DebianEdu/Documentation/Bookworm"/>",
so that "<ulink url="https://wiki.debian.org/DebianEdu/Documentation/Bookworm"/>"
is in a separate paragraph and could easily be filtered out by the script.
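
To make the idea concrete, here is a small Python sketch of such a
filter. I am not claiming that get_manual works this way; the regular
expression and the function names are assumptions made only for this
illustration. It separates paragraphs that consist of nothing but a
single <ulink .../> element from those that contain translatable text.

```python
import re

# Matches a paragraph that holds only one empty <ulink url="..."/> element.
LINK_ONLY = re.compile(r'\s*<ulink\s+url="[^"]*"\s*/>\s*')


def is_link_only(paragraph: str) -> bool:
    """Return True if the paragraph contains nothing but a ulink element."""
    return LINK_ONLY.fullmatch(paragraph) is not None


def partition(paragraphs):
    """Split paragraphs into (translatable, link_only) lists, keeping order."""
    translatable, link_only = [], []
    for paragraph in paragraphs:
        if is_link_only(paragraph):
            link_only.append(paragraph)
        else:
            translatable.append(paragraph)
    return translatable, link_only
```

With the rewritten wording, the link paragraph would end up in the
second list and never reach the translators, while the original
wording (link in the middle of a sentence) would still have to be
translated as a whole.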

Comments are welcome.

-- 
Kind regards,
Frans Spiesschaert


> 
> Kind regards,
> Wolfgang
> (who can't do more)
