Re: [Foundation-l] Push translation

2010-08-17 Thread stevertigo
Michael Galvez wrote: > Translator Toolkit processes generic Media Wiki text, although this is not > an officially supported feature and is largely untested.  If you upload a > UTF-8 file with extension ".mediawiki", Translator Toolkit will try to > render the file in the same way that it renders

Re: [Foundation-l] Push translation

2010-08-17 Thread David Gerard
On 17 August 2010 03:22, Samuel Klein wrote: > On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez wrote: >> On Fri, Aug 6, 2010 at 5:56 PM, David Gerard wrote: >>> And the data that GTTK gathers from its use in Wikipedia translations? >>> What would need to happen for that to start coming back, in

Re: [Foundation-l] Push translation

2010-08-16 Thread Samuel Klein
On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez wrote: > On Fri, Aug 6, 2010 at 5:56 PM, David Gerard wrote: >> >> And the data that GTTK gathers from its use in Wikipedia translations? >> What would need to happen for that to start coming back, in a usable >> form? > > The translated segments ar

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Sat, Aug 7, 2010 at 11:30 PM, Lars Aronsson wrote: > On 08/06/2010 07:47 PM, Michael Galvez wrote: > > 3. We acquire dictionaries on limited licenses from other parties. In > > general, while we can surface this content on our own sites (e.g., Google > > Translate, Google Dictionary, Google T

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Thu, Aug 5, 2010 at 9:45 PM, stevertigo wrote: > Michael Galvez wrote: > > Sorry for coming into this discussion a bit late. I'm one of the members > of > > Google's translation team, and I wanted to make myself available for > > feedback/questions. > > Thanks for stopping by. A few question

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Sat, Aug 7, 2010 at 4:52 AM, Federico Leva (Nemo) wrote: > Michael Galvez, 05/08/2010 15:12: > > Sorry for coming into this discussion a bit late. I'm one of the members > of > > Google's translation team, and I wanted to make myself available for > > feedback/questions. > > Thank you, you've

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Sat, Aug 7, 2010 at 1:38 AM, Mark Williamson wrote: > > On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson > wrote: > > > >> > 2) Implement spelling and punctuation check automatically within GTTK > >> before > >> > posting of the articles. > >> > > >> > There is spell check in Translator Toolki

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Fri, Aug 6, 2010 at 5:56 PM, David Gerard wrote: > On 6 August 2010 18:47, Michael Galvez wrote: > > > 3. We acquire dictionaries on limited licenses from other parties. In > > general, while we can surface this content on our own sites (e.g., Google > > Translate, Google Dictionary, Google

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
Hi Amir, Apologies for the late reply. Replies inline below. Mike On Fri, Aug 6, 2010 at 3:14 PM, Amir E. Aharoni < amir.ahar...@mail.huji.ac.il> wrote: > Dear Michael, I also thank you for joining the discussion. See my > question below. > > 2010/8/6 Michael Galvez : > >> Also, as far as Ind

Re: [Foundation-l] Push translation

2010-08-13 Thread Michael Galvez
On Fri, Aug 6, 2010 at 2:13 PM, Michael Snow wrote: > Michael Galvez wrote: > > 2. Once the articles exist in multiple languages, the articles take on a > > life of their own and become out of sync. If Wikipedians want to keep > those > > articles in sync, we would like to help them by enabling

Re: [Foundation-l] Push translation

2010-08-07 Thread Lars Aronsson
On 08/06/2010 07:47 PM, Michael Galvez wrote: > 3. We acquire dictionaries on limited licenses from other parties. In > general, while we can surface this content on our own sites (e.g., Google > Translate, Google Dictionary, Google Translator Toolkit), we don't have > permission to donate that da

Re: [Foundation-l] Push translation

2010-08-07 Thread stevertigo
Michael Galvez wrote: > Sorry for coming into this discussion a bit late.  I'm one of the members of > Google's translation team, and I wanted to make myself available for > feedback/questions. Thanks for stopping by. A few questions: 1) Does GTTK have a specific API for Mediawiki markup ("wikite

Re: [Foundation-l] Push translation

2010-08-07 Thread Federico Leva (Nemo)
Michael Galvez, 05/08/2010 15:12: > Sorry for coming into this discussion a bit late. I'm one of the members of > Google's translation team, and I wanted to make myself available for > feedback/questions. Thank you, you've explained some important things. > There is spell check in Translator Too

Re: [Foundation-l] Push translation

2010-08-07 Thread Federico Leva (Nemo)
Andreas Kolbe, 07/08/2010 02:23: > If Google want to build up their translation memory, I suggest they pay > publishers for permission to analyse existing, published translations, and > read those into their memory. This will give them a database of translations > that the market judged good eno

Re: [Foundation-l] Push translation

2010-08-06 Thread Mark Williamson
> On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson wrote: > >> > 2) Implement spelling and punctuation check automatically within GTTK >> before >> > posting of the articles. >> > >> > There is spell check in Translator Toolkit, although it's not available >> for >> > all languages.  We don't have

Re: [Foundation-l] Push translation

2010-08-06 Thread Andreas Kolbe
--- On Sat, 31/7/10, Nikola Smolenski wrote: > Interestingly, I have had a completely opposite > experiences. When reading a > Google translation, it is easy for me to decipher what does > it mean even if > it is not grammatically correct. When translating, I often > hang on deciding > what sent

Re: [Foundation-l] Push translation

2010-08-06 Thread David Gerard
On 6 August 2010 18:47, Michael Galvez wrote: > 3. We acquire dictionaries on limited licenses from other parties.  In > general, while we can surface this content on our own sites (e.g., Google > Translate, Google Dictionary, Google Translator Toolkit), we don't have > permission to donate that

Re: [Foundation-l] Push translation

2010-08-06 Thread Amir E. Aharoni
Dear Michael, I also thank you for joining the discussion. See my question below. 2010/8/6 Michael Galvez : >> Also, as far as Indic languages go, I would ask if there's any chance >> you have any Oriya speakers - with 637 articles, the Oriya Wikipedia >> is by far the most anemic of Indic-languag

Re: [Foundation-l] Push translation

2010-08-06 Thread Michael Snow
Michael Galvez wrote: > 2. Once the articles exist in multiple languages, the articles take on a > life of their own and become out of sync. If Wikipedians want to keep those > articles in sync, we would like to help them by enabling section-level > translation. > I'm guessing that few communit

Re: [Foundation-l] Push translation

2010-08-06 Thread Michael Galvez
Hi Mark, Responses inline. Mike On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson wrote: > > 2) Implement spelling and punctuation check automatically within GTTK > before > > posting of the articles. > > > > There is spell check in Translator Toolkit, although it's not available > for > > all l

Re: [Foundation-l] Push translation

2010-08-06 Thread Michael Galvez
Hi Lars, Thanks for the detailed feedback. Some comments inline. Mike On Thu, Aug 5, 2010 at 1:39 PM, Lars Aronsson wrote: > On 08/05/2010 03:12 PM, Michael Galvez wrote: > > Sorry for coming into this discussion a bit late. I'm one of the members > of > > Google's translation team, and I wa

Re: [Foundation-l] Push translation

2010-08-05 Thread Mark Williamson
> 2) Implement spelling and punctuation check automatically within GTTK before > posting of the articles. > > There is spell check in Translator Toolkit, although it's not available for > all languages.  We don't have any punctuation checks today and I doubt that > we can release this anytime soon.

Re: [Foundation-l] Push translation

2010-08-05 Thread Lars Aronsson
On 08/05/2010 03:12 PM, Michael Galvez wrote: > Sorry for coming into this discussion a bit late. I'm one of the members of > Google's translation team, and I wanted to make myself available for > feedback/questions. This is an unusual and most welcome step for Google. When I first learned about

Re: [Foundation-l] Push translation

2010-08-05 Thread Michael Galvez
Sorry for coming into this discussion a bit late. I'm one of the members of Google's translation team, and I wanted to make myself available for feedback/questions. Quoting some suggestions from Mark earlier in the thread: 1) Fix some of the formatting errors with GTTK. Would this really be so d

Re: [Foundation-l] Push translation

2010-08-04 Thread Federico Leva (Nemo)
Aphaia, 27/07/2010 21:33: > I've noticed that many English Wikipedia articles cite only articles > written in English even if the topics are of the non-English world. And > normally, especially in the developing world, the most comprehensive > sources are found in their own languages - how can those articles be

Re: [Foundation-l] Push translation

2010-08-02 Thread stevertigo
Nikola Smolenski wrote: > Interestingly, I have had a completely opposite experiences. When reading a > Google translation, it is easy for me to decipher what does it mean even if > it is not grammatically correct. When translating, I often hang on deciding > what sentence structure to use, or on

Re: [Foundation-l] Push translation

2010-07-31 Thread Nikola Smolenski
Дана Friday 30 July 2010 02:31:44 Andreas Kolbe написа: > Having tried it tonight, I don't find the Google translator toolkit all > that useful, at least not at this present level of development. To sum up: > > First you read their translation. > > Then you scratch your head: What the deuce is that

Re: [Foundation-l] Push translation

2010-07-30 Thread stevertigo
Muhammad Yahia wrote: >   Where is the community? where is the involvement and exchange of ideas and >   continuous evolvement of articles? where's the wiki in wikipedia? >   - I see it as POV to assume that wiki x has the 'perfect' article on a >   certain subject such that everyone in the world

Re: [Foundation-l] Push translation

2010-07-29 Thread Andreas Kolbe
> Subject: Re: [Foundation-l] Push translation > To: "Wikimedia Foundation Mailing List" > Date: Wednesday, 28 July, 2010, 0:27 > Mass machine translations ("pushing" them onto other projects that may or may not want them) is a very bad idea. >

Re: [Foundation-l] Push translation

2010-07-29 Thread Muhammad Yahia
My 2c: - I don't know where everyone came up with the notion that the tool produces good results. Most of the articles on both Google's projects on the Arabic Wikipedia are barely intelligible, with broken sentences, weird terminology and generally can be spotted right away (see my re

Re: [Foundation-l] Push translation

2010-07-29 Thread Amir E. Aharoni
2010/7/29 Mark Williamson > I don't think that's completely unwise, though. I'm sure they get tons > of crackpot e-mails all the time. I was reading an official blog about > Google Translate, and in the post about their Wikipedia contests, > someone wrote an angry comment that google "must hate S

Re: [Foundation-l] Push translation

2010-07-28 Thread Mark Williamson
Google is, in my experience, very difficult for "regular" people to get in touch with. Sometimes, when a product is in beta, they give you a way to contact them. They used to have an e-mail to contact them at if you had information about bilingual corpora (I found one online from the Nunavut parlia

Re: [Foundation-l] Push translation

2010-07-28 Thread Amir E. Aharoni
Is anyone from Google reading this thread? Because of this thread I tried to play with the Google Translator Toolkit a little and found some technical problems. When I tried to send bug reports about them through the "Contact us" form, I received after a few minutes a "bounce" message from the tra

Re: [Foundation-l] Push translation

2010-07-28 Thread Mark Williamson
Yes, of course if it's not actually reviewed and corrected by a human it's going to be bad. What I said was that if it's used "as it was meant to be used", the results should be indistinguishable from a normal human translation, regardless of the language involved because all mistakes would be fixe

Re: [Foundation-l] Push translation

2010-07-27 Thread Ting Chen
Hello all, I am a heavy translator on Wikimedia projects. I would say more than 95% of my contributions on content are translation. But I am against blind translation. For example, mostly I would translate British or North American related content from en-wp to zh-wp, and not from other lang

Re: [Foundation-l] Push translation

2010-07-27 Thread stevertigo
Ray Saintonge wrote: > Suppose for a minute that your proposal were implemented, and all the > machine translation problems were overcome. Would English NPOV be so > good that community members in the target language would be incapable of > making substantive improvements? And if they did make sub

Re: [Foundation-l] Push translation

2010-07-27 Thread Aphaia
On Wed, Jul 28, 2010 at 7:26 AM, Michael Snow wrote: > Aphaia wrote: >> Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be >> called toolkit, but it doesn't change it is useless; it cannot deal >> with syntax properly, i.e. conjugation etc. at this moment.  Intended >> to be "re

Re: [Foundation-l] Push translation

2010-07-27 Thread Cool Hand Luke
Mass machine translations ("pushing" them onto other projects that may or may not want them) is a very bad idea. Beginning in 2004-05, a non-native speaker on en.wp decided that he should import slightly-cleaned babelfish translations of foreign language articles that did not have articles on the

Re: [Foundation-l] Push translation

2010-07-27 Thread Michael Snow
Aphaia wrote: > Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be > called toolkit, but it doesn't change it is useless; it cannot deal > with syntax properly, i.e. conjugation etc. at this moment. Intended > to be "reviewed and corrected by a human" doesn't assure it was reall

Re: [Foundation-l] Push translation

2010-07-27 Thread Aphaia
Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be called toolkit, but it doesn't change it is useless; it cannot deal with syntax properly, i.e. conjugation etc. at this moment. Intended to be "reviewed and corrected by a human" doesn't assure it was really "reviewed and correc

Re: [Foundation-l] Push translation

2010-07-27 Thread Casey Brown
On Tue, Jul 27, 2010 at 3:44 PM, Mark Williamson wrote: > Aphaia, Shiju Alex and I are referring to Google Translator Toolkit, > not Google Translate. If the person using the Toolkit uses it as it > was _meant_ to be used, the results should be as good as a human > translation because they've been

Re: [Foundation-l] Push translation

2010-07-27 Thread Mark Williamson
Aphaia, Shiju Alex and I are referring to Google Translator Toolkit, not Google Translate. If the person using the Toolkit uses it as it was _meant_ to be used, the results should be as good as a human translation because they've been reviewed and corrected by a human. -m. On Tue, Jul 27, 2010 a

Re: [Foundation-l] Push translation

2010-07-27 Thread Aphaia
GT fails. At least for Japanese, it sucks. And that is why I don't support it. GT may suit SVO languages, but for SOV languages it is nothing but crap. Imagine fixing a 4,000-word document in which every line is a sort of "all your base is belong to us". It's not a simple thing as you imagine

Re: [Foundation-l] Push translation

2010-07-27 Thread Aphaia
I've noticed that many English Wikipedia articles cite only articles written in English even if the topics are of the non-English world. And normally, especially in the developing world, the most comprehensive sources are found in their own languages - how can those articles be assured in NPOV when they ignore

Re: [Foundation-l] Push translation

2010-07-27 Thread John Vandenberg
On Tue, Jul 27, 2010 at 8:42 PM, David Gerard wrote: > Because such a statement is factually inaccurate - en:wp *did* use the > 1911EB as starter material. ..and for [[Accius]], with 150 views per month, not even a single word has been added after three years. -- John Vandenberg

Re: [Foundation-l] Push translation

2010-07-27 Thread Shiju Alex
On Tue, Jul 27, 2010 at 11:42 AM, Mark Williamson wrote: > > 4) Include a list of most needed articles for people to create, rather > than random articles that will be of little use to local readers. Some > articles, such as those on local topics, have the added benefit of > encouraging more edit

Re: [Foundation-l] Push translation

2010-07-27 Thread Magnus Manske
On Tue, Jul 27, 2010 at 11:42 AM, David Gerard wrote: > On 27 July 2010 09:36, Shiju Alex wrote: > >> Wiki communities like the biological growth of the wikipedia articles in >> their wiki. Why English Wikipedia did not start building wikipedia articles >> using *Encyclopedia Britannica 1911* edi

Re: [Foundation-l] Push translation

2010-07-27 Thread David Gerard
On 27 July 2010 09:36, Shiju Alex wrote: > Wiki communities like the biological growth of the wikipedia articles in > their wiki. Why English Wikipedia did not start building wikipedia articles > using *Encyclopedia Britannica 1911* edition which was available in the > public domain? Er, are yo

Re: [Foundation-l] Push translation

2010-07-27 Thread Mark Williamson
On Tue, Jul 27, 2010 at 1:36 AM, Shiju Alex wrote: >   1. Ban the project of Google as done by the Bengali wiki community (Bad >   solution, and I am personally against this solution) >   2. Ask Google to engage wiki community (As happened in the case of Tamil) >   to find out a working solution.

Re: [Foundation-l] Push translation

2010-07-27 Thread Shiju Alex
> > Google Translator Toolkit is particularly problematic because it > messes up the existing article formatting (one example, it messes up > internal links by putting punctuation marks before double brackets > when they should be after) and it includes incompatible formatting > such as redlinked t

Re: [Foundation-l] Push translation

2010-07-27 Thread Ray Saintonge
stevertigo wrote: > Mark Williamson wrote: > >> I would like to add to this that I think the worst part of this idea >> is the assumption that other languages should take articles from >> en.wp. >> > The idea is that most of en.wp's articles are well-enough written, and > written in accord

Re: [Foundation-l] Push translation

2010-07-27 Thread Ray Saintonge
Mark Williamson wrote: > Google Translator Toolkit is particularly problematic because it > messes up the existing article formatting (one example, it messes up > internal links by putting punctuation marks before double brackets > when they should be after) and it includes incompatible formatting

Re: [Foundation-l] Push translation

2010-07-26 Thread Mark Williamson
Shiju Alex, Stevertigo is just one en.wikipedian. As far as using exact copies goes, I don't know about the policy at your home wiki, but in many Wikipedias this sort of back-and-forth translation and trading and sharing of articles has been going on since day one, not just with English but with

Re: [Foundation-l] Push translation

2010-07-26 Thread Shiju Alex
> > really? It's a) not > particularly well-written, mostly and b) referenced overwhelmingly to > >> English-language sources, most of which are, you guessed it.. Western in >> > nature. > Very much true. Now English Wikipedians want some one to translate and use the exact copy of en:wp in all oth

Re: [Foundation-l] Push translation

2010-07-26 Thread Oliver Keyes
"The idea is that most of en.wp's articles are well-enough written, and written in accord with NPOV to a sufficient degree to overcome any such criticism of 'imperial encyclopedism.' - really? It's a) not particularly well-written, mostly and b) referenced overwhelmingly to English-language sources

Re: [Foundation-l] Push translation

2010-07-26 Thread stevertigo
Mark Williamson wrote: > I would like to add to this that I think the worst part of this idea > is the assumption that other languages should take articles from > en.wp. The idea is that most of en.wp's articles are well-enough written, and written in accord with NPOV to a sufficient degree to ov

Re: [Foundation-l] Push translation

2010-07-26 Thread Pavlo Shevelo
> I don't know whether other wikipedias have similar policies, but on > the Italian Wikipedia an article which is just a machine translation > can be speedy deleted according to our policies. The reason is that > machine translations are not good enough and the autotranslated text > is too difficul

Re: [Foundation-l] Push translation

2010-07-26 Thread Marco Chiesa
On Sat, Jul 24, 2010 at 2:57 AM, stevertigo wrote: > Translation between wikis currently exists as a largely pulling > paradigm: Someone on the target wiki finds an article in another > language (English for example) and then pulls it to their language > wiki. > > These days Google and other trans

Re: [Foundation-l] Push translation

2010-07-25 Thread Mark Williamson
I would like to add to this that I think the worst part of this idea is the assumption that other languages should take articles from en.wp. I would be in favor of an international, language-free Wikipedia if/when perfect (or 99.99% accurate) MT software exists, but that is not currently the case.

Re: [Foundation-l] Push translation

2010-07-25 Thread Mark Williamson
On Sat, Jul 24, 2010 at 11:03 PM, Casey Brown wrote: > On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson wrote: >> Wikipedias are not for _cultures_, they are for languages. If I and > > I'm surprised to hear that coming from someone who I thought to be a > student of languages.  I think you might

Re: [Foundation-l] Push translation

2010-07-25 Thread Ray Saintonge
stevertigo wrote: > Translation between wikis currently exists as a largely pulling > paradigm: Someone on the target wiki finds an article in another > language (English for example) and then pulls it to their language > wiki. > > These days Google and other translate tools are good enough to use

Re: [Foundation-l] Push translation

2010-07-24 Thread Casey Brown
On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson wrote: > Wikipedias are not for _cultures_, they are for languages. If I and I'm surprised to hear that coming from someone who I thought to be a student of languages. I think you might want to read an article from today's Wall Street Journal, abo

Re: [Foundation-l] Push translation

2010-07-24 Thread Mark Williamson
Bence, that's a different topic - MAT (Machine Aided Translation), and in the case of Bengali, I believe simply the use of a translation memory system. Some of the comments on that page seem to be quite misinformed, ranging from people who thought Google was inserting unrevised machine translations

Re: [Foundation-l] Push translation

2010-07-24 Thread Mark Williamson
Wikipedias are not for _cultures_, they are for languages. If I and 1,000 other Americans suddenly learnt French (to the point of native-level fluency) and decided to read and edit the French Wikipedia, it would "belong" to us just as much as to anybody else. This came up recently in the debate abo

Re: [Foundation-l] Push translation

2010-07-24 Thread Oliver Keyes
Agreed. There's one wiki which artificially inflated the number of articles it had via a bot (I forget the specific language). That's not a way to increase the wiki's strength. There's an old phrase used on en-wiki: "Africa is not a redlink". It means that because we have articles on a lot of commo

Re: [Foundation-l] Push translation

2010-07-24 Thread Cristian Consonni
2010/7/24 Casey Brown : > On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo > wrote: >>> These days Google and other translate tools are good enough to use as >>> the starting basis for an translated article >> >> No, it's far not true - at least for such target language as Ukrainian etc. >> >> So a

Re: [Foundation-l] Push translation

2010-07-24 Thread Casey Brown
On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo wrote: >> These days Google and other translate tools are good enough to use as >> the starting basis for an translated article > > No, it's far not true - at least for such target language as Ukrainian etc. > > So any attempt of "push" translation wi

Re: [Foundation-l] Push translation

2010-07-24 Thread Oliver Keyes
"If there are issues, they can be overcome. The fact of the matter is that the vast majority of articles in English can be "pushed" over to other languages, and fill a need for those topics in those languages." - if there are vast swathes in other languages that aren't filled, it's normally indica

Re: [Foundation-l] Push translation

2010-07-24 Thread Bence Damokos
As far as push translation goes, there are languages where it could almost work and where it couldn't. (Consider the experience of the Google team with the Bengali Wikipedia - http://googletranslate.blogspot.com/2010/07/translating-wikipedia.html) Bence

Re: [Foundation-l] Push translation

2010-07-24 Thread Pavlo Shevelo
> These days Google and other translate tools are good enough to use as > the starting basis for an translated article No, it's far not true - at least for such target language as Ukrainian etc. So any attempt of "push" translation will be almost the disaster... On Sat, Jul 24, 2010 at 3:57 AM,

[Foundation-l] Push translation

2010-07-24 Thread stevertigo
Translation between wikis currently exists as a largely pulling paradigm: Someone on the target wiki finds an article in another language (English for example) and then pulls it to their language wiki. These days Google and other translate tools are good enough to use as the starting basis for an