Re: List of typos.

Ondřej Bílka Fri, 05 Jul 2013 22:45:06 -0700

On Fri, Jul 05, 2013 at 05:17:54PM +0100, Jonathan Wakely wrote:
> On 5 July 2013 16:43, Ondřej Bílka wrote:
> >
> > Hi, I ran aspell on comments in gcc. After a bit of cleaning, a list with
> > frequencies is here. It is still relatively noisy and more heuristics
> > are needed.
> >
> > http://kam.mff.cuni.cz/~ondra/gcc_misspells
> >
> > What will we do with this now?
> 
> It doesn't look very useful yet, clearly "namespace" and "param" are not 
> errors.
We need to teach aspell about these. I am thinking about creating a shared
wordlist that gcc developers will use. It is mainly a logistics problem; I
could imagine having a shared file on sourceware and using a script like
this.


scp remote_wordlist wordlist 
aspell merge english wordlist
aspell -m wordlist -p new    
scp remote_wordlist wordlist # To decrease race conditions.
aspell merge wordlist new
scp wordlist remote_wordlist
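
The fetch, merge, re-fetch, push sequence above can be sketched in Python
as follows. This is only an illustration under stated assumptions: the
names merge_wordlists, sync, fetch_remote and push_remote are hypothetical,
the two callables stand in for the scp steps, and the merge is a plain
set union rather than anything aspell does internally.

```python
def merge_wordlists(*wordlists):
    """Return the sorted union of several word lists, skipping blank entries."""
    merged = set()
    for words in wordlists:
        merged.update(w.strip() for w in words if w.strip())
    return sorted(merged)


def sync(fetch_remote, push_remote, new_words):
    """Merge locally accepted words into the shared list.

    fetch_remote and push_remote are hypothetical callables standing in
    for the scp steps. The second fetch narrows the window for a race
    with another developer but does not eliminate it.
    """
    merged = merge_wordlists(fetch_remote(), new_words)
    # Fetch again just before pushing, to pick up words added meanwhile.
    merged = merge_wordlists(fetch_remote(), merged)
    push_remote(merged)
    return merged
```

Because the merge is a union, running it twice or in a different order
gives the same result, so a lost race only costs a re-run and never a
corrupted list.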

> 
> "acccepted" and "accestor" and "actullay" are real spelling mistakes,
> but someone will have to do a grep through the whole tree to see where
> they come from, and then ignore all the ones in ChangeLog files.

If I could extract the score that aspell uses to rank its candidates, I
could sort the words with the most likely typos first. I wrote to
aspell-user but have had no response yet.
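
Until that score is available, a rough stand-in could rank flagged words by
how close each one is to some known word. The sketch below uses Python's
difflib similarity ratio, which is not aspell's score; rank_by_similarity
and both word lists are hypothetical and only illustrate the sorting idea.

```python
from difflib import SequenceMatcher


def rank_by_similarity(flagged, known):
    """Sort flagged words so that the ones closest to a known word come first.

    Returns (flagged_word, closest_known_word, ratio) tuples. The ratio is
    difflib's similarity in [0, 1], used here as a stand-in for a real
    spell-checker score.
    """
    ranked = []
    for word in flagged:
        best = max(known, key=lambda k: SequenceMatcher(None, word, k).ratio())
        score = SequenceMatcher(None, word, best).ratio()
        ranked.append((word, best, score))
    # Highest similarity first: a near miss of a real word is a likely typo.
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked
```

With such a ranking, a word like "acccepted" would sort ahead of
"namespace", since it is only one letter away from a dictionary word.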

This touches only comments, not changelogs.

Ondra
