Michael Galvez wrote:
> Translator Toolkit processes generic MediaWiki text, although this is not
> an officially supported feature and is largely untested. If you upload a
> UTF-8 file with extension ".mediawiki", Translator Toolkit will try to
> render the file in the same way that it renders
On 17 August 2010 03:22, Samuel Klein wrote:
> On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez wrote:
>> On Fri, Aug 6, 2010 at 5:56 PM, David Gerard wrote:
>>> And the data that GTTK gathers from its use in Wikipedia translations?
>>> What would need to happen for that to start coming back, in
On Fri, Aug 13, 2010 at 4:28 PM, Michael Galvez wrote:
> On Fri, Aug 6, 2010 at 5:56 PM, David Gerard wrote:
>>
>> And the data that GTTK gathers from its use in Wikipedia translations?
>> What would need to happen for that to start coming back, in a usable
>> form?
>
> The translated segments ar
On Sat, Aug 7, 2010 at 11:30 PM, Lars Aronsson wrote:
> On 08/06/2010 07:47 PM, Michael Galvez wrote:
> > 3. We acquire dictionaries on limited licenses from other parties. In
> > general, while we can surface this content on our own sites (e.g., Google
> > Translate, Google Dictionary, Google T
On Thu, Aug 5, 2010 at 9:45 PM, stevertigo wrote:
> Michael Galvez wrote:
> > Sorry for coming into this discussion a bit late. I'm one of the members of
> > Google's translation team, and I wanted to make myself available for
> > feedback/questions.
>
> Thanks for stopping by. A few question
On Sat, Aug 7, 2010 at 4:52 AM, Federico Leva (Nemo) wrote:
> Michael Galvez, 05/08/2010 15:12:
> > Sorry for coming into this discussion a bit late. I'm one of the members of
> > Google's translation team, and I wanted to make myself available for
> > feedback/questions.
>
> Thank you, you've
On Sat, Aug 7, 2010 at 1:38 AM, Mark Williamson wrote:
> > On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson wrote:
> >
> >> > 2) Implement spelling and punctuation check automatically within GTTK before
> >> > posting of the articles.
> >> >
> >> > There is spell check in Translator Toolki
On Fri, Aug 6, 2010 at 5:56 PM, David Gerard wrote:
> On 6 August 2010 18:47, Michael Galvez wrote:
>
> > 3. We acquire dictionaries on limited licenses from other parties. In
> > general, while we can surface this content on our own sites (e.g., Google
> > Translate, Google Dictionary, Google
Hi Amir,
Apologies for the late reply. Replies inline below.
Mike
On Fri, Aug 6, 2010 at 3:14 PM, Amir E. Aharoni <
amir.ahar...@mail.huji.ac.il> wrote:
> Dear Michael, I also thank you for joining the discussion. See my
> question below.
>
> 2010/8/6 Michael Galvez:
> >> Also, as far as Ind
On Fri, Aug 6, 2010 at 2:13 PM, Michael Snow wrote:
> Michael Galvez wrote:
> > 2. Once the articles exist in multiple languages, the articles take on a
> > life of their own and become out of sync. If Wikipedians want to keep those
> > articles in sync, we would like to help them by enabling
On 08/06/2010 07:47 PM, Michael Galvez wrote:
> 3. We acquire dictionaries on limited licenses from other parties. In
> general, while we can surface this content on our own sites (e.g., Google
> Translate, Google Dictionary, Google Translator Toolkit), we don't have
> permission to donate that da
Michael Galvez wrote:
> Sorry for coming into this discussion a bit late. I'm one of the members of
> Google's translation team, and I wanted to make myself available for
> feedback/questions.
Thanks for stopping by. A few questions: 1) Does GTTK have a specific
API for Mediawiki markup ("wikite
Michael Galvez, 05/08/2010 15:12:
> Sorry for coming into this discussion a bit late. I'm one of the members of
> Google's translation team, and I wanted to make myself available for
> feedback/questions.
Thank you, you've explained some important things.
> There is spell check in Translator Too
Andreas Kolbe, 07/08/2010 02:23:
> If Google want to build up their translation memory, I suggest they pay
> publishers for permission to analyse existing, published translations, and
> read those into their memory. This will give them a database of translations
> that the market judged good eno
> On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson wrote:
>
>> > 2) Implement spelling and punctuation check automatically within GTTK before
>> > posting of the articles.
>> >
>> > There is spell check in Translator Toolkit, although it's not available for
>> > all languages. We don't have
--- On Sat, 31/7/10, Nikola Smolenski wrote:
> Interestingly, I have had a completely opposite experience. When reading a
> Google translation, it is easy for me to decipher what it means even if
> it is not grammatically correct. When translating, I often hang on deciding
> what sent
On 6 August 2010 18:47, Michael Galvez wrote:
> 3. We acquire dictionaries on limited licenses from other parties. In
> general, while we can surface this content on our own sites (e.g., Google
> Translate, Google Dictionary, Google Translator Toolkit), we don't have
> permission to donate that
Dear Michael, I also thank you for joining the discussion. See my
question below.
2010/8/6 Michael Galvez:
>> Also, as far as Indic languages go, I would ask if there's any chance
>> you have any Oriya speakers - with 637 articles, the Oriya Wikipedia
>> is by far the most anemic of Indic-languag
Michael Galvez wrote:
> 2. Once the articles exist in multiple languages, the articles take on a
> life of their own and become out of sync. If Wikipedians want to keep those
> articles in sync, we would like to help them by enabling section-level
> translation.
>
I'm guessing that few communit
Hi Mark,
Responses inline.
Mike
On Thu, Aug 5, 2010 at 2:22 PM, Mark Williamson wrote:
> > 2) Implement spelling and punctuation check automatically within GTTK before
> > posting of the articles.
> >
> > There is spell check in Translator Toolkit, although it's not available for
> > all l
Hi Lars,
Thanks for the detailed feedback. Some comments inline.
Mike
On Thu, Aug 5, 2010 at 1:39 PM, Lars Aronsson wrote:
> On 08/05/2010 03:12 PM, Michael Galvez wrote:
> > Sorry for coming into this discussion a bit late. I'm one of the members of
> > Google's translation team, and I wa
> 2) Implement spelling and punctuation check automatically within GTTK before
> posting of the articles.
>
> There is spell check in Translator Toolkit, although it's not available for
> all languages. We don't have any punctuation checks today and I doubt that
> we can release this anytime soon.
On 08/05/2010 03:12 PM, Michael Galvez wrote:
> Sorry for coming into this discussion a bit late. I'm one of the members of
> Google's translation team, and I wanted to make myself available for
> feedback/questions.
This is an unusual and most welcome step for Google. When I first
learned about
Sorry for coming into this discussion a bit late. I'm one of the members of
Google's translation team, and I wanted to make myself available for
feedback/questions.
Quoting some suggestions from Mark earlier in the thread:
1) Fix some of the formatting errors with GTTK. Would this really be so
d
Aphaia, 27/07/2010 21:33:
> I've noticed that many English Wikipedia articles cite only articles
> written in English even if the topics are from the non-English world. And
> normally, especially in the developing world, the most comprehensive
> sources are found in their own languages - how can those articles be
Nikola Smolenski wrote:
> Interestingly, I have had a completely opposite experience. When reading a
> Google translation, it is easy for me to decipher what it means even if
> it is not grammatically correct. When translating, I often hang on deciding
> what sentence structure to use, or on
On Friday 30 July 2010 02:31:44 Andreas Kolbe wrote:
> Having tried it tonight, I don't find the Google translator toolkit all
> that useful, at least not at this present level of development. To sum up:
>
> First you read their translation.
>
> Then you scratch your head: What the deuce is that
Muhammad Yahia wrote:
> Where is the community? where is the involvement and exchange of ideas and
> continuous evolvement of articles? where's the wiki in wikipedia?
> - I see it as POV to assume that wiki x has the 'perfect' article on a
> certain subject such that everyone in the world
> Subject: Re: [Foundation-l] Push translation
> To: "Wikimedia Foundation Mailing List"
> Date: Wednesday, 28 July, 2010, 0:27
> Mass machine translations ("pushing" them onto other projects that may or
> may not want them) is a very bad idea.
>
My 2c:
- I don't know where everyone came up with the notion that the tool
produces good results. Most of the articles on both Google's projects on the
Arabic wikipedia are barely intelligible, with broken sentences, weird
terminology and generally can be spotted right away (see my re
2010/7/29 Mark Williamson
> I don't think that's completely unwise, though. I'm sure they get tons
> of crackpot e-mails all the time. I was reading an official blog about
> Google Translate, and in the post about their Wikipedia contests,
> someone wrote an angry comment that google "must hate S
Google is, in my experience, very difficult for "regular" people to
get in touch with. Sometimes, when a product is in beta, they give you
a way to contact them. They used to have an e-mail to contact them at
if you had information about bilingual corpora (I found one online
from the Nunavut parlia
Is anyone from Google reading this thread?
Because of this thread i tried to play with the Google Translator Toolkit a
little and found some technical problems. When i tried to send bug reports
about them through the "Contact us" form, i received after a few minutes a
"bounce" message from the tra
Yes, of course if it's not actually reviewed and corrected by a human
it's going to be bad. What I said was that if it's used "as it was
meant to be used", the results should be indistinguishable from a
normal human translation, regardless of the language involved because
all mistakes would be fixe
Hello all,
I am a heavy translator on Wikimedia projects. I would say more than 95%
of my content contributions are translations. But I am against blind
translation. For example, I would mostly translate British or North
American related content from en-wp to zh-wp, and not from other
lang
Ray Saintonge wrote:
> Suppose for a minute that your proposal were implemented, and all the
> machine translation problems were overcome. Would English NPOV be so
> good that community members in the target language would be incapable of
> making substantive improvements? And if they did make sub
On Wed, Jul 28, 2010 at 7:26 AM, Michael Snow wrote:
> Aphaia wrote:
>> Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be
>> called a toolkit, but that doesn't change that it is useless; it cannot deal
>> with syntax properly, i.e. conjugation etc. at this moment. Intended
>> to be "re
Mass machine translations ("pushing" them onto other projects that may or
may not want them) is a very bad idea.
Beginning in 2004-05, a non-native speaker on en.wp decided that he should
import slightly-cleaned babelfish translations of foreign language articles
that did not have articles on the
Aphaia wrote:
> Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be
> called a toolkit, but that doesn't change that it is useless; it cannot deal
> with syntax properly, i.e. conjugation etc. at this moment. Intended
> to be "reviewed and corrected by a human" doesn't assure it was reall
Ah, I omitted T, and I meant Toolkit. A toolkit with garbage could be
called a toolkit, but that doesn't change that it is useless; it cannot deal
with syntax properly, i.e. conjugation etc. at this moment. Intended
to be "reviewed and corrected by a human" doesn't assure it was really
"reviewed and correc
On Tue, Jul 27, 2010 at 3:44 PM, Mark Williamson wrote:
> Aphaia, Shiju Alex and I are referring to Google Translator Toolkit,
> not Google Translate. If the person using the Toolkit uses it as it
> was _meant_ to be used, the results should be as good as a human
> translation because they've been
Aphaia, Shiju Alex and I are referring to Google Translator Toolkit,
not Google Translate. If the person using the Toolkit uses it as it
was _meant_ to be used, the results should be as good as a human
translation because they've been reviewed and corrected by a human.
-m.
On Tue, Jul 27, 2010 a
GT fails. At least for Japanese, it sucks. And that is why I don't
support it. GT may suit SVO languages, but for SOV languages, it is
nothing but crap.
Imagine fixing a 4000-word document in which every line is a sort of
"all your base is belong to us". It's not as simple a thing as you
imagine
I've noticed that many English Wikipedia articles cite only articles
written in English even if the topics are from the non-English world. And
normally, especially in the developing world, the most comprehensive
sources are found in their own languages - how can those articles be
assured in NPOV when they ignore
On Tue, Jul 27, 2010 at 8:42 PM, David Gerard wrote:
> Because such a statement is factually inaccurate - en:wp *did* use the
> 1911EB as starter material.
..and for [[Accius]], with 150 views per month, not even a single word
has been added after three years.
--
John Vandenberg
On Tue, Jul 27, 2010 at 11:42 AM, Mark Williamson wrote:
>
> 4) Include a list of most needed articles for people to create, rather
> than random articles that will be of little use to local readers. Some
> articles, such as those on local topics, have the added benefit of
> encouraging more edit
On Tue, Jul 27, 2010 at 11:42 AM, David Gerard wrote:
> On 27 July 2010 09:36, Shiju Alex wrote:
>
>> Wiki communities like the biological growth of the wikipedia articles in
>> their wiki. Why English Wikipedia did not start building wikipedia articles
>> using *Encyclopedia Britannica 1911* edi
On 27 July 2010 09:36, Shiju Alex wrote:
> Wiki communities like the biological growth of the wikipedia articles in
> their wiki. Why English Wikipedia did not start building wikipedia articles
> using *Encyclopedia Britannica 1911* edition which was available in the
> public domain?
Er, are yo
On Tue, Jul 27, 2010 at 1:36 AM, Shiju Alex wrote:
> 1. Ban the project of Google as done by the Bengali wiki community (Bad
> solution, and I am personally against this solution)
> 2. Ask Google to engage wiki community (As happened in the case of Tamil)
> to find out a working solution.
>
> Google Translator Toolkit is particularly problematic because it
> messes up the existing article formatting (one example, it messes up
> internal links by putting punctuation marks before double brackets
> when they should be after) and it includes incompatible formatting
> such as redlinked t
stevertigo wrote:
> Mark Williamson wrote:
>
>> I would like to add to this that I think the worst part of this idea
>> is the assumption that other languages should take articles from
>> en.wp.
>>
> The idea is that most of en.wp's articles are well-enough written, and
> written in accord
Mark Williamson wrote:
> Google Translator Toolkit is particularly problematic because it
> messes up the existing article formatting (one example, it messes up
> internal links by putting punctuation marks before double brackets
> when they should be after) and it includes incompatible formatting
Shiju Alex,
Stevertigo is just one en.wikipedian.
As far as using exact copies goes, I don't know about the policy at
your home wiki, but in many Wikipedias this sort of back-and-forth
translation and trading and sharing of articles has been going on
since day one, not just with English but with
> really? It's a) not particularly well-written, mostly and b) referenced
> overwhelmingly to English-language sources, most of which are, you guessed
> it.. Western in nature.
>
Very much true. Now English Wikipedians want someone to translate and use
the exact copy of en:wp in all oth
"The idea is that most of en.wp's articles are well-enough written, and
written in accord with NPOV to a sufficient degree to overcome any
such criticism of 'imperial encyclopedism.'" - really? It's a) not
particularly well-written, mostly and b) referenced overwhelmingly to
English-language sources
Mark Williamson wrote:
> I would like to add to this that I think the worst part of this idea
> is the assumption that other languages should take articles from
> en.wp.
The idea is that most of en.wp's articles are well-enough written, and
written in accord with NPOV to a sufficient degree to ov
> I don't know whether other wikipedias have similar policies, but on
> the Italian Wikipedia an article which is just a machine translation
> can be speedy deleted according to our policies. The reason is that
> machine translations are not good enough and the autotranslated text
> is too difficul
On Sat, Jul 24, 2010 at 2:57 AM, stevertigo wrote:
> Translation between wikis currently exists as a largely pulling
> paradigm: Someone on the target wiki finds an article in another
> language (English for example) and then pulls it to their language
> wiki.
>
> These days Google and other trans
I would like to add to this that I think the worst part of this idea
is the assumption that other languages should take articles from
en.wp.
I would be in favor of an international, language-free Wikipedia
if/when perfect (or 99.99% accurate) MT software exists, but that is
not currently the case.
On Sat, Jul 24, 2010 at 11:03 PM, Casey Brown wrote:
> On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson wrote:
>> Wikipedias are not for _cultures_, they are for languages. If I and
>
> I'm surprised to hear that coming from someone who I thought to be a
> student of languages. I think you might
stevertigo wrote:
> Translation between wikis currently exists as a largely pulling
> paradigm: Someone on the target wiki finds an article in another
> language (English for example) and then pulls it to their language
> wiki.
>
> These days Google and other translate tools are good enough to use
On Sun, Jul 25, 2010 at 1:39 AM, Mark Williamson wrote:
> Wikipedias are not for _cultures_, they are for languages. If I and
I'm surprised to hear that coming from someone who I thought to be a
student of languages. I think you might want to read an
article from today's Wall Street Journal, abo
Bence, that's a different topic - MAT (Machine Aided Translation), and
in the case of Bengali, I believe simply the use of a translation
memory system. Some of the comments on that page seem to be quite
misinformed, ranging from people who thought Google was inserting
unrevised machine translations
Wikipedias are not for _cultures_, they are for languages. If I and
1,000 other Americans suddenly learnt French (to the point of
native-level fluency) and decided to read and edit the French
Wikipedia, it would "belong" to us just as much as to anybody else.
This came up recently in the debate abo
Agreed. There's one wiki which artificially inflated the number of articles
it had via a bot (I forget the specific language). That's not a way to
increase the wiki's strength. There's an old phrase used on en-wiki: "Africa
is not a redlink". It means that because we have articles on a lot of commo
2010/7/24 Casey Brown:
> On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo wrote:
>>> These days Google and other translate tools are good enough to use as
>>> the starting basis for a translated article
>>
>> No, that's far from true - at least for target languages such as Ukrainian.
>>
>> So a
On Sat, Jul 24, 2010 at 4:11 PM, Pavlo Shevelo wrote:
>> These days Google and other translate tools are good enough to use as
>> the starting basis for a translated article
>
> No, that's far from true - at least for target languages such as Ukrainian.
>
> So any attempt at "push" translation wi
"If there are issues, they can be overcome. The fact of the matter is
that the vast majority of articles in English can be "pushed" over to
other languages, and fill a need for those topics in those languages." - if
there are vast swathes in other languages that aren't filled, it's normally
indica
As far as push translation goes, there are languages where it could almost
work and where it couldn't. (Consider the experience of the Google team with
the Bengali Wikipedia -
http://googletranslate.blogspot.com/2010/07/translating-wikipedia.html )
Bence
> These days Google and other translate tools are good enough to use as
> the starting basis for a translated article
No, that's far from true - at least for target languages such as Ukrainian.
So any attempt at "push" translation will be almost a disaster...
On Sat, Jul 24, 2010 at 3:57 AM,
Translation between wikis currently exists as a largely pulling
paradigm: Someone on the target wiki finds an article in another
language (English for example) and then pulls it to their language
wiki.
These days Google and other translate tools are good enough to use as
the starting basis for an