Bruno Haible <[EMAIL PROTECTED]> writes:

> It's a similar problem with "soupçon" or "©" as for proper names, but the
> output and the handling is different.
>   - For a string like "He showed a soupçon of pride" you want to rely
>     entirely on the translator and drop the English original if a
>     translation is present.

Sure, but gettext doesn't handle that case well now, since I can't
write this:

   gettext ("The author is François Pinard.")
   gettext ("He showed a soupçon of pride.")

as msgids are supposed to be ASCII.  In theory I should write the
ASCII-only:

   gettext ("The author is Francois Pinard.")
   gettext ("He showed a soupcon of pride.")

and have an English-translation .po file, but nobody does that.
(I tried doing something like this with diffutils but it was a pain
to maintain.)

A solution that handles only proper names will lead to diagnostics
that are somewhat stilted, since in some languages, the form of a
proper name depends on its grammatical position in the sentence.
In contrast, a solution that handles any text string (such as the
two examples above) will lead to less-stilted output.

>   - For a proper name, one should show the English name beneath the
>     translation, because
>       1. the person might not want to see his/her name distorted,
>       2. so that the user can enter the name in search engines or similar.

That's fine, but the tricky part is the "beneath the translation"
business.  Surely it's not always correct to put the translation in
parentheses after the translated name -- parentheses count as
"beneath" in some languages but surely not all.

Also, how about programs that want to output the author's name in
English but put the original name afterwards?  This is common
practice among Far East authors.  E.g., suppose I want to generate
this output in English:

   This program was written by Yukihiro Matsumoto (にっき).

and this output in Japanese:

   このプログラムはにっき<<Yukihiro Matsumoto>>によって書かれていた。

(Sorry, I don't know Japanese -- I just made this up and used
Babelfish to come up with quasi-Japanese, but I hope you get the idea
of the problem I'm thinking about.)  How should I do this sort of
thing with propername?

> OK, I'll make the test stricter, testing whether the occurrence of
> trim (name) inside the translation starts and ends at word boundaries.

Thanks, that's better, but I am dubious of this stricter test as
well, for the usual reasons.  For starters, not every language has
word boundaries.
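
To illustrate the worry, here is a minimal sketch of a byte-oriented
word-boundary test.  It is not the code in the proposed module; the
names is_word_byte and occurs_word_bounded are hypothetical, and the
rule "every non-ASCII byte counts as part of a word" is an assumption
I made just to have something concrete to look at.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical rule: ASCII letters and digits, plus every byte of a
   multibyte UTF-8 character, count as part of a word.  */
bool
is_word_byte (unsigned char c)
{
  return c >= 0x80 || isalnum (c);
}

/* Return true if NAME occurs somewhere in TEXT with no word byte
   immediately before or after the occurrence.  */
bool
occurs_word_bounded (const char *text, const char *name)
{
  size_t len = strlen (name);
  if (len == 0)
    return false;

  for (const char *p = strstr (text, name); p != NULL;
       p = strstr (p + 1, name))
    {
      bool start_ok = p == text || !is_word_byte ((unsigned char) p[-1]);
      bool end_ok = p[len] == '\0' || !is_word_byte ((unsigned char) p[len]);
      if (start_ok && end_ok)
        return true;
    }
  return false;
}
```

With this rule, a name embedded directly between kana, with no spaces
or punctuation around it, is not considered to be at word boundaries,
so the test would conclude that the translation lacks the name even
though a reader can plainly see it.  A different rule would fail in
some other way; that is the sort of thing I mean.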

> if you call
>    proper_name_utf8 ("Georgia O'Keefe", "Georgia O’Keefe")
> the function will test whether it finds either of the two spellings among
> the translation. This should also be sufficient to avoid an output
> "<something or other> (Georgia O’Keefe) (Georgia O'Keefe)".

Thanks for explaining this; it wasn't clear to me.  But I'm still not
clear on what proper_name_utf8 is supposed to mean.  For example,
suppose I want to output the proper name "Xi'an".  Should I use the
UTF-8 English name?

   proper_name_utf8 ("Xi'an", "Xī’ān")

or should I use the UTF-8 Chinese name?

   proper_name_utf8 ("Xi'an", "西安")

(Here the UTF-8 English romanization uses tone marks to indicate
Mandarin pronunciation, which is a bit pedantic, but even if you omit
the tone marks you still have the problem with the apostrophe.)

It seems to me that I really need at least _three_ alternatives here:
one for ASCII (Xi'an), one for English (Xī’ān), and one for the
native-language form (西安).

> The person has a right to see his/her name spelled out correctly.

Yes, no question of that!  I'm just trying to understand the proposed
module.
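
For what it's worth, the three-alternative idea above could be
sketched roughly as follows.  This is purely hypothetical: struct
name_forms, choose_name_form, and the two boolean parameters are names
I invented for illustration and are not part of propername or any
existing module, and a real interface would presumably consult the
locale and the translation catalog rather than take flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the three spellings of one proper name.  */
struct name_forms
{
  const char *ascii;    /* ASCII fallback, e.g. "Xi'an" */
  const char *english;  /* UTF-8 English form, e.g. "Xī’ān"; may be NULL */
  const char *native;   /* UTF-8 native-language form, e.g. "西安"; may be NULL */
};

/* Pick a spelling.  CAN_SHOW_UTF8 says whether the output encoding can
   represent the UTF-8 forms; READER_USES_NATIVE_SCRIPT says whether the
   user's language is the one the native form is written in.  */
const char *
choose_name_form (const struct name_forms *n, bool can_show_utf8,
                  bool reader_uses_native_script)
{
  if (!can_show_utf8)
    return n->ascii;
  if (reader_uses_native_script && n->native != NULL)
    return n->native;
  return n->english != NULL ? n->english : n->ascii;
}
```

A two-argument call has no slot for the third spelling, which is why
I don't see how to express the Xi'an case with proper_name_utf8 as
described.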