Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions.
Quoting some suggestions from Mark earlier in the thread:

> 1) Fix some of the formatting errors with GTTK. Would this really be so difficult? It seems to me that the breaking of links is a bug that needs fixing by Google.

We're working on various formatting errors based on our conversations with members of the Tamil and Telugu Wikipedias. We're hoping to push those fixes out soon (in the coming weeks).

> 2) Implement spelling and punctuation check automatically within GTTK before posting of the articles.

There is spell check in Translator Toolkit, although it's not available for all languages. We don't have any punctuation checks today, and I doubt that we can release one anytime soon. (If it's not available in Google Docs or Gmail, then it's unlikely that we'll have it for Translator Toolkit, since we use the same infrastructure.)

What's the proposal, though: would you like us to prevent publishing of articles if they have too many spelling errors, or simply warn the user that there are X spelling errors? Any input you can provide on preferred behavior would be great.

> 3) Have GTTK automatically remove broken templates and images, or require users to translate any templates before a page may be posted.

Templates are a bit tricky. Sometimes, a template in one Wikipedia does not exist in another Wikipedia. Other times, a template in one language maps to a template in another language but the parameters are different.

Removing broken templates automatically may not work because some templates come between words. If we remove them, the sentence or paragraph may become invalid. We've also considered creating a custom interface for localizing templates, but this requires a lot of work.

In the interim, the approach we've taken is to have translators fix the templates in Wikipedia when they post the article from Translator Toolkit. When a user clicks on Share > Publish to source page in Translator Toolkit, the Wikipedia article opens in preview mode; it's not live. The idea is that if there are any errors, the translator can fix them before saving the article.

> 4) Include a list of most needed articles for people to create, rather than random articles that will be of little use to local readers. Some articles, such as those on local topics, have the added benefit of encouraging more edits and community participation since they tend to generate more interest from speakers of a language in my experience.

The articles we selected actually weren't random. Here's how we selected them:

1. we looked at the top Google searches in the region (e.g., for Tamil, we looked at searches in India and, I believe, Sri Lanka as well)
2. from the top Google searches in the region, we looked at the top clicked Wikipedia articles, regardless of the language (so we wound up with Wikipedia source articles in English, Hindi, and other languages)
3. from the top clicked Wikipedia articles, we looked for articles that were either stubs or unavailable in the local language; these are the articles that we sent for translation

This selection isn't perfect. For example, it assumes that the top clicked Wikipedia articles by all users in India/Sri Lanka, who may be searching in English, Hindi, Tamil, or some other language, are relevant to the Tamil community.

To address this, last month we met with members of the Tamil and Telugu Wikipedias to improve the article selection. The main changes that we agreed on were:

1. the local Wikipedia community gives Google the final OK on which articles should or should not be translated
2. the local Wikipedia community can add articles to Google's list
3. the local Wikipedia community can suggest titles for the articles
4. Google's translators will post the articles under their own user names, and they will monitor community feedback on their user pages until the translation meets the community's standards

We're just getting started on this new process, and we'll keep refining it with the Tamil and Telugu communities as we move forward. If it's successful, we'll use it as our template for other projects.

As always, any feedback or suggestions are welcome. Also, while I plan to look at this foundation list periodically, if you have bugs, you can also file them in our bug queue: translator-toolkit-support at google.com. While the eng team may not monitor this list, they do look at the support queue.

Regards,
Mike

On Wed, Aug 4, 2010 at 5:17 PM, Federico Leva (Nemo) <nemow...@gmail.com> wrote:

> Aphaia, 27/07/2010 21:33:
> > I've noticed many of English Wikipedia articles cite only English
> > written articles even if the topics are of non-English world. And
> > normally, specially in the developing world, the most comprehend
> > sources are found in their own languages - how can those articles be
> > assured in NPOV when they ignore the majority of reliable sources?
>
> It's not only a matter of NPOV. There's even a policy for this:
> http://en.wikipedia.org/wiki/Wikipedia:Verifiability#Non-English_sources
> Obviously you can expect other language version to want the same for
> their language.
>
> Nemo
>
> _______________________________________________
> foundation-l mailing list
> foundation-l@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l