Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-02 Thread Milos Rancic
On Thu, Apr 2, 2009 at 3:52 PM, Aryeh Gregor wrote:
> I suspect this would be feasible to get working to an acceptable
> level, but only with a lot of effort.  Natural languages are really
> messy.  :(

If you treat words as strings, they are really messy, yes. But, if you treat words as words, yo

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-02 Thread Tim Starling
Ziko van Dijk wrote:
> Dear Aryeh,
>
> Your idea of "converting on the fly" would not work in many cases. Take for
> example the ß in German WP. Swiss (registered) readers can decide via their
> Preferences to see only ss and never ß, because the Swiss do not use ß.
> That's ok. But vice versa, no

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-02 Thread Aryeh Gregor
On Thu, Apr 2, 2009 at 3:49 AM, Ray Saintonge wrote:
> When you declare one version canonical the risk is that you will have
> supporters of the losing version(s) becoming irrationally angry.

Which version was canonical is an implementation detail that wouldn't even be visible to contributors, so

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-02 Thread Ziko van Dijk
Dear Aryeh,

Your idea of "converting on the fly" would not work in many cases. Take for example the ß in German WP. Swiss (registered) readers can decide via their Preferences to see only ss and never ß, because the Swiss do not use ß. That's ok. But vice versa, not every ss is to be converted to
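
The asymmetry described in this message can be illustrated with a minimal Python sketch. The function names `to_swiss` and `reverse_candidates` and the three-word lexicon are hypothetical, chosen only for illustration; they are not part of MediaWiki or of any conversion engine discussed in this thread.

```python
# Illustrative sketch only: names and lexicon are assumptions, not an existing API.

def to_swiss(text: str) -> str:
    """Forward direction: every ß becomes ss, with no exceptions."""
    return text.replace("ß", "ss")


# Tiny hypothetical lexicon of standard German spellings.
LEXICON = {"Straße", "Maße", "Masse"}


def reverse_candidates(word: str, lexicon: set[str]) -> list[str]:
    """Reverse direction: list every lexicon entry that maps to `word`.

    More than one candidate means a plain string replacement cannot
    decide; the converter would need context or an editor's markup.
    """
    return sorted(entry for entry in lexicon if to_swiss(entry) == word)


print(to_swiss("Straße"))                    # Strasse
print(reverse_candidates("Masse", LEXICON))  # ['Masse', 'Maße']
```

Here "Maße" (measurements) and "Masse" (mass) both become "Masse" in the Swiss spelling, so the reverse mapping has two candidates and cannot be done by blind substitution.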

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-02 Thread Milos Rancic
On Wed, Apr 1, 2009 at 5:32 PM, Ziko van Dijk wrote:
> - a split of the Wikipedias into two; this is most likely when there are
> other linguistic differences e.g. in dictionary.

Dictionary is not a problem. This is the option for Ekavian-Iyekavian conversion engine for Serbian. I made an algorit

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-02 Thread Milos Rancic
On Thu, Apr 2, 2009 at 9:49 AM, Ray Saintonge wrote:
> Aryeh Gregor wrote:
>> On Wed, Apr 1, 2009 at 11:32 AM, Ziko van Dijk
>> wrote:
>>
>>> I am sceptical about automatic conversion. As you said, it is mainly a
>>> solution for reading, but not for writing, because the source text is in one
>>

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-02 Thread Ray Saintonge
Aryeh Gregor wrote:
> On Wed, Apr 1, 2009 at 11:32 AM, Ziko van Dijk
> wrote:
>
>> I am sceptical about automatic conversion. As you said, it is mainly a
>> solution for reading, but not for writing, because the source text is in one
>> specific spelling or character system.
>>
> Why coul

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-01 Thread Aryeh Gregor
On Wed, Apr 1, 2009 at 11:32 AM, Ziko van Dijk wrote:
> I am sceptical about automatic conversion. As you said, it is mainly a
> solution for reading, but not for writing, because the source text is in one
> specific spelling or character system.

Why couldn't that be converted on the fly as well?

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-01 Thread Ziko van Dijk
Milos, thank you for the very comprehensive presentation of the problem. There are other cases that could be mentioned; it is indeed a problem touching most of the language editions. I am sceptical about automatic conversion. As you said, it is mainly a solution for reading, but not for writing, be

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-01 Thread Milos Rancic
On Wed, Apr 1, 2009 at 2:11 PM, Tim Starling wrote:
> It sounds like a good project for a directed grant. Have you tried
> contacting potential grant-making organisations? I imagine some
> awesome things could be done with as little as $100K.

First, sorry for forgetting you. You were the only per

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-01 Thread Milos Rancic
On Wed, Apr 1, 2009 at 2:40 PM, Ting Chen wrote:
> As far as I know, you can define escapes globally for the whole article.
> This would make an escape in every sentence unnecessary. Take as example
> the following example:
> http://zh.wikipedia.org/wiki/%E6%96%AF%E6%B4%9B%E5%8D%9A%E4%B8%B9%C2%B7%

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-01 Thread Ting Chen
Milos Rancic wrote:
> Chinese is a little bit more complex because there are a number of
> characters. However, AFAIK, Simplified and Traditional scripts share a
> number of characters and some of others may be guessed from context.
>
>
Well, Chinese is not that simple, especially for the differ

Re: [Foundation-l] Frustration with the conversion engines issue

2009-04-01 Thread Tim Starling
> == What do we need? ==
>
> Actually, we don't need a lot to solve this problem. I have the
> solution for the most important part of the problem, the linguistic
> one. Even if I don't have enough of time to deal with all cases, I am
> able to find students or professors of linguists who are will

[Foundation-l] Frustration with the conversion engines issue

2009-03-31 Thread Milos Rancic
For a couple of years I have been talking to different people inside WMF about the need to solve the conversion engines issue systematically. However, all of the responses I am getting are non-understanding (in better cases) or silence.

== Why do we need conversion engines? ==

Unlike, for exampl