On 3/01/2016 19:11, Jesper Kristensen wrote:
I don't think it is that special. Some Firefox locales other than en-US
ship with built in dictionaries. For those, the add-on could be derived
from the source of the Firefox locale.

It is special because Mozilla maintain the dictionary themselves rather than just copying an upstream source:
https://hg.mozilla.org/mozilla-central/log/tip/extensions/spellcheck/locales/en-US/hunspell/en-US.dic

I maintain the Danish dictionary add-on on AMO. Whenever upstream
releases a new version, I commit it to the Firefox localization source,
and from there I have a script to generate an identical add-on for AMO:

I think this is a very good approach. Your script shows that it is simple to ship a dictionary as an add-on. As I said: Someone should start updating the "official" en-US add-on on AMO again.

A little more background so that you see that English is much more complicated than Danish:

When I started this thread, my aim was to stop maintaining the Mozilla en-US dictionary and use whatever the upstream source, SCOWL in this case, provides.

This was met with fierce opposition. Mozilla use SCOWL data, but carry forward additional words: currently 6000 (doubtful) proper names, 37 Mozilla terms, 337 extra words and also 354 erroneous words which I am about to remove.
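To make that split concrete, here is a rough Python sketch of how one could separate the words that come from upstream from the ones Mozilla carries forward. It assumes both word lists are available as Hunspell .dic text (a count on the first line, then one word per line with optional affix flags after a slash). The function names parse_dic and split_by_origin are made up for illustration, and this is not the script anyone actually uses; it also ignores escaped slashes inside words.

```python
def parse_dic(text):
    """Return the set of bare words found in Hunspell .dic text."""
    lines = text.splitlines()
    # The first line of a .dic file is an (approximate) entry count; skip it.
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]
    words = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Affix flags follow a slash, e.g. "walk/DGS"; keep only the word.
        # Simplification: escaped slashes ("\/") inside words are not handled.
        words.add(line.split("/", 1)[0])
    return words


def split_by_origin(mozilla_words, scowl_words):
    """Split the Mozilla word set into upstream words and carried-forward words."""
    upstream = mozilla_words & scowl_words   # also present in SCOWL
    carried = mozilla_words - scowl_words    # only in the Mozilla dictionary
    return upstream, carried
```

The "carried" set is where the proper names, Mozilla terms, extra words and erroneous words would all end up; telling those categories apart still needs a human.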

If I had to decide, I'd use the SCOWL data, perhaps add the 37 Mozilla terms for the geeks, so if they write "SpiderMonkey", they don't get an error, and be done with it.

Since as an add-on author I can decide, I did exactly that. I took the SCOWL data and put it into an add-on. The end. I didn't add the Mozilla terms, simply because most users have never heard of "SpiderMonkey" and won't use this word. Those who do can add it to their personal dictionary.

SCOWL provide various "sizes". I used their "large" size, especially since I know that SCOWL moved many useful and common words from their "size 60" ("normal") to "size 70" ("large"). I also know that SCOWL "size 60" doesn't contain common variants, like "advisor" instead of "adviser". I have proposed to use the "large" size, but that was also rejected. Ehsan's approach is to keep using the "normal" size, but to produce a customised version that includes the common variants. There are also efforts to recover some of the 5670 words lost due to SCOWL-internal changes, either by getting SCOWL to reclassify them or by adding them back independently of SCOWL.
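As an illustration of what "recovering lost words" involves, the following Python sketch classifies words that disappeared from the "normal" list between two SCOWL releases: those that merely moved to the "large" list, and those that are gone altogether. The function name classify_lost_words and the idea that the three word lists are already loaded as plain collections of strings are my assumptions; it is only a sketch of the comparison, not the tooling actually in use.

```python
def classify_lost_words(old_normal, new_normal, new_large):
    """Classify words that vanished from the "normal" size between releases.

    Returns (moved_to_large, dropped_entirely):
      moved_to_large   - lost words that still exist in the new "large" list
      dropped_entirely - lost words that are in neither new list
    """
    lost = set(old_normal) - set(new_normal)
    moved_to_large = lost & set(new_large)
    dropped_entirely = lost - moved_to_large
    return moved_to_large, dropped_entirely
```

Words in the first set are candidates for asking SCOWL to reclassify; words in the second set could only come back by being added independently of SCOWL.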

In other words, we're getting ever more deeply entangled in a business I think Mozilla shouldn't be in.

Jorg K.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
