On Thu, Apr 2, 2009 at 3:52 PM, Aryeh Gregor
wrote:
> I suspect this would be feasible to get working to an acceptable
> level, but only with a lot of effort. Natural languages are really
> messy. :(
If you treat words as strings, they are really messy, yes. But, if you
treat words as words, yo
Ziko van Dijk wrote:
> Dear Aryeh,
>
> Your idea of "converting on the fly" would not work in many cases. Take for
> example the ß in German WP. Swiss (registered) readers can decide via their
> Preferences to see only ss and never ß, because the Swiss do not use ß.
> That's ok. But vice versa, no
On Thu, Apr 2, 2009 at 3:49 AM, Ray Saintonge wrote:
> When you declare one version canonical the risk is that you will have
> supporters of the losing version(s) becoming irrationally angry.
Which version was canonical is an implementation detail that wouldn't
even be visible to contributors, so
Dear Aryeh,
Your idea of "converting on the fly" would not work in many cases. Take for
example the ß in German WP. Swiss (registered) readers can decide via their
Preferences to see only ss and never ß, because the Swiss do not use ß.
That's ok. But vice versa, not every ss is to be converted to
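
The asymmetry described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the conversion code MediaWiki actually uses; the function names `to_swiss` and `naive_to_eszett` are hypothetical and chosen for this example.

```python
def to_swiss(text: str) -> str:
    """Replace every eszett with 'ss'. This direction needs no context."""
    return text.replace("ß", "ss")


def naive_to_eszett(text: str) -> str:
    """Hypothetical naive reverse rule: turn every 'ss' into an eszett.

    Shown only to demonstrate why a plain string replacement fails
    in this direction.
    """
    return text.replace("ss", "ß")


# Forward conversion is safe: the Swiss spelling is always 'ss'.
print(to_swiss("Straße"))

# The reverse is not: 'Wasser' is spelled with 'ss' everywhere,
# so the naive rule produces a misspelling.
print(naive_to_eszett("Wasser"))

# Two distinct words collapse to one Swiss spelling, so the original
# cannot be recovered from the string alone.
print(to_swiss("Maße") == to_swiss("Masse"))
```

A reverse conversion would therefore need word-level knowledge (a lexicon or per-word markup) rather than a character substitution.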
On Wed, Apr 1, 2009 at 5:32 PM, Ziko van Dijk wrote:
> - a split of the Wikipedias into two; this is most likely when there are
> other linguistic differences e.g. in dictionary.
A dictionary is not a problem. This is the option for the Ekavian-Iyekavian
conversion engine for Serbian. I made an algorit
On Thu, Apr 2, 2009 at 9:49 AM, Ray Saintonge wrote:
> Aryeh Gregor wrote:
>> On Wed, Apr 1, 2009 at 11:32 AM, Ziko van Dijk
>> wrote:
>>
>>> I am sceptical about automatic conversion. As you said, it is mainly a
>>> solution for reading, but not for writing, because the source text is in one
>>
Aryeh Gregor wrote:
> On Wed, Apr 1, 2009 at 11:32 AM, Ziko van Dijk
> wrote:
>
>> I am sceptical about automatic conversion. As you said, it is mainly a
>> solution for reading, but not for writing, because the source text is in one
>> specific spelling or character system.
>>
> Why coul
On Wed, Apr 1, 2009 at 11:32 AM, Ziko van Dijk wrote:
> I am sceptical about automatic conversion. As you said, it is mainly a
> solution for reading, but not for writing, because the source text is in one
> specific spelling or character system.
Why couldn't that be converted on the fly as well?
Milos, thank you for the very comprehensive presentation of the problem.
There are other cases that could be mentioned; it is indeed a problem
affecting most of the language editions.
I am sceptical about automatic conversion. As you said, it is mainly a
solution for reading, but not for writing, be
On Wed, Apr 1, 2009 at 2:11 PM, Tim Starling wrote:
> It sounds like a good project for a directed grant. Have you tried
> contacting potential grant-making organisations? I imagine some
> awesome things could be done with as little as $100K.
First, sorry for forgetting you. You were the only per
On Wed, Apr 1, 2009 at 2:40 PM, Ting Chen wrote:
> As far as I know, you can define escapes globally for the whole article.
> This would make an escape in every sentence unnecessary. Take the
> following example:
> http://zh.wikipedia.org/wiki/%E6%96%AF%E6%B4%9B%E5%8D%9A%E4%B8%B9%C2%B7%
Milos Rancic wrote:
> Chinese is a little bit more complex because there are a number of
> characters. However, AFAIK, Simplified and Traditional scripts share a
> number of characters and some of the others may be guessed from context.
>
>
Well, Chinese is not that simple, especially for the differ
> == What do we need? ==
>
> Actually, we don't need a lot to solve this problem. I have the
> solution for the most important part of the problem, the linguistic
> one. Even if I don't have enough time to deal with all cases, I am
> able to find students or professors of linguistics who are will
For a couple of years I have been talking to different people inside WMF
about the need to solve the conversion engine issue systematically.
However, all of the responses I get are either incomprehension
(at best) or silence.
== Why do we need conversion engines? ==
Unlike, for exampl
14 matches