On 30/12/2015 01:46, Ehsan Akhgari wrote:
> First things first, let's correct something here. We do _not_ maintain
> three word lists. We maintain one list: the list of words that the
> Firefox spellchecker accepts.
I know I sound like a broken record: I suggested changing the process
and maintaining three lists.
> I'm afraid you're misunderstanding what's happened here. We only
> maintain one word list, and our process of merging upstream changes is
> purely additive. As a result, it doesn't handle the case where a word
> disappears from SCOWL.
This is clearly a bug, and should be fixed.
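To illustrate the bug, here is a minimal Python sketch of the difference
between a purely additive merge and one that also follows upstream
removals. The function names and the placeholder words are assumptions
made for illustration; this is not the actual Mozilla merge script.

```python
def merge_additive(current, upstream):
    """Purely additive merge: words are only ever added, never dropped."""
    return set(current) | set(upstream)


def merge_following_removals(current, old_upstream, new_upstream):
    """Merge that also drops words upstream removed since the last import.

    Needs the previous upstream snapshot to tell what was removed.
    """
    removed_upstream = set(old_upstream) - set(new_upstream)
    return (set(current) - removed_upstream) | set(new_upstream)


# Placeholder data (hypothetical, not real SCOWL content).
old_upstream = {"alpha", "beta", "gamma"}
new_upstream = {"alpha", "beta"}            # upstream dropped "gamma"
current = old_upstream | {"mozword"}        # "mozword" is a local addition

additive = merge_additive(current, new_upstream)
following = merge_following_removals(current, old_upstream, new_upstream)
```

With the additive merge, "gamma" stays in the list forever. The second
variant drops it, but only because it compares two upstream snapshots; it
still cannot tell whether a word was added locally on purpose.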
I came to realise that my argument has a hole. On one hand I'm
complaining that at the beginning of May 2015 words got removed, see:
https://hg.mozilla.org/mozilla-central/diff/bcb133a3cdca/extensions/spellcheck/locales/en-US/hunspell/dictionary-sources/orig/en_US.dic
(don't open in Firefox, it will hang, bug 1235321):
-relict
-residuary
-enforceability
(all still included in the "large" dataset).
On the other hand I'm complaining that wrong entries, like "remind's",
are maintained in the Mozilla data. "remind's" is not a valid word in
SCOWL (http://app.aspell.net/lookup?dict=en_US&words=remind%27s), but it
is in Mozilla. So there is a bug in the removal process.
Frankly, I can't understand how the current system could manage SCOWL
removals yet not remove words Mozilla specifically added. How does it
know that a word came from SCOWL and can be removed, or that it didn't come
from SCOWL and should be maintained? Broken record: Maintaining Mozilla
words differently (three lists) would fix this.
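A rough sketch of what I mean, in Python. The list names and contents
below are hypothetical placeholders; only the idea of keeping the Mozilla
additions in separate files from the upstream list comes from my proposal.

```python
def build_dictionary(upstream_words, mozilla_lists):
    """Final word list = current upstream list plus all Mozilla addition lists."""
    words = set(upstream_words)
    for additions in mozilla_lists.values():
        words |= set(additions)
    return words


# Hypothetical Mozilla-maintained lists, kept apart from upstream.
mozilla_lists = {
    "list_one": {"mozword"},            # placeholder Mozilla-specific addition
    "list_two": {"otherword"},          # placeholder Mozilla-specific addition
    "pending_upstream": {"newword"},    # general words requested upstream
}

old_upstream = {"alpha", "beta", "gamma"}
new_upstream = {"alpha", "beta"}        # upstream dropped "gamma"

before_refresh = build_dictionary(old_upstream, mozilla_lists)
after_refresh = build_dictionary(new_upstream, mozilla_lists)
```

A refresh just swaps in the new upstream list. Upstream removals take
effect automatically, and the Mozilla additions survive because they were
never mixed into the upstream data in the first place.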
> We may still decide to keep
> individual words that SCOWL drops if we decide that we want the Firefox
> spell checker to accept them, but as a general rule we should probably
> follow upstream.
It is pretty much unmanageable to do this. On every refresh you would
have to re-add the removed words manually. Broken record: Mozilla should not
manage the general English words (apart from some exceptions, see below).
>> We should leave it to SCOWL
>> to manage the plain English dictionary and only manage the Mozilla
>> additions (for which I see three classes, see above).
> I disagree. I think we should accept the words that we want, and then
> try to upstream them to SCOWL, without holding Firefox back until that
> happens. I experimented with this once
> <https://github.com/kevina/wordlist/issues/117> but unfortunately I
> haven't had the time to go through all of the list. (As a non-native
> speaker this task requires me to spend weeks looking things up in
> dictionaries!)
I sound like a broken record but you ignored my proposal: To facilitate
the process of having more words than SCOWL, I proposed to split these
"more words" into three files. The third file would contain "general"
words we request upstream.
> Wonderful! If you have a list of words using these types of characters
> that we need to add, please file a bug, and let's do that!
No, I won't do that. I filed a bug to use the "large" dictionary, but
you even changed the summary and hijacked it for something else. It
makes no sense to request a heap of words to be added to the Mozilla
dictionary, like "résumé", "née" and so on, which already exist in the
"large" dataset. Broken record: Mozilla doesn't want to be in the
business of managing this. Mozilla should be in the business of managing
Mozilla specific additions, and perhaps a small amount of general words
that get added (third list), which will then be requested upstream.
The current system can't do this, you resist changing it, so I just give
up, since I'm not using the defective en-US spelling anyway.
Jorg K.