On 02.09.2010 at 18:22, Stephan Witt wrote:

> On 02.09.2010 at 15:42, Abdelrazak Younes wrote:
>
>> On 09/02/2010 02:19 PM, Stephan Witt wrote:
>>> On 02.09.2010 at 13:42, Abdelrazak Younes wrote:
>>>
>>>> On 09/02/2010 12:13 PM, Stephan Witt wrote:
>>>>
>>>>> I could go back to the row based spell check and abandon the checker
>>>>> state of paragraph.
>>>>> But I decided to do the move from row to paragraph checking when seeing
>>>>> the loop code in
>>>>> Buffer.cpp which implements the explicit spell check with F7.
>>>>> This code iterates word-wise without knowledge of screen rows (of
>>>>> course). So the
>>>>> performance problem pops up here again.
>>>>>
>>>> Quite frankly I think the spell cache solution that I proposed a while
>>>> back would have resolved all performance problems. Pity that I don't have
>>>> any time to work on this...
>>>>
>>> I don't think so. Remember that the checking of the whole visible part of
>>> the document word by word - even when it is done only once - is a problem
>>> with Apple's native spell checker.
>>>
>>
>> Well, if it is done once the slowdown will be once for one word. If the word
>> is found again in the document (and the probability is quite high) then the
>> spell cache will come into play without any help from the slow spellchecker.
>
> Would be handy to have some statistics of relative frequency of word usage...
> I don't have them.

I changed the Paragraph class to do that. The result for the Tutorial is:

Paragraph.cpp(3296): paragraph statistics: 5 words, 5 distinct.
Paragraph.cpp(3320): register statistics: 5 words, 5 new words.
Paragraph.cpp(3296): paragraph statistics: 1 words, 1 distinct.
Paragraph.cpp(3320): register statistics: 1 words, 1 new words.
Paragraph.cpp(3296): paragraph statistics: 8 words, 8 distinct.
Paragraph.cpp(3320): register statistics: 8 words, 8 new words.
Paragraph.cpp(3296): paragraph statistics: 27 words, 22 distinct.
Paragraph.cpp(3320): register statistics: 22 words, 21 new words.
Paragraph.cpp(3296): paragraph statistics: 3 words, 3 distinct.
Paragraph.cpp(3320): register statistics: 3 words, 3 new words.
... cut output of ~200 paragraphs
Paragraph.cpp(3296): paragraph statistics: 12 words, 10 distinct.
Paragraph.cpp(3320): register statistics: 10 words, 0 new words.
Paragraph.cpp(3296): paragraph statistics: 1 words, 1 distinct.
Paragraph.cpp(3320): register statistics: 1 words, 1 new words.
Paragraph.cpp(3296): paragraph statistics: 18 words, 15 distinct.
Paragraph.cpp(3320): register statistics: 15 words, 6 new words.

This sums up to 2605 words with length >= 6 characters, 872 of them different.
Every first occurrence of a word is a cache miss, so this is a cache hit
ratio of (2605-872)/2605 = 0.665 (about 66%).
Not very impressive...

The User's Guide looks better: 11921 words, 1738 misses,
(11921-1738)/11921 = 0.85 (85%).
Better indeed... but I'm not sure the cache solution alone is responsive
enough for the user. Maybe one can combine both: the cache and
paragraph/sentence based checking.

>> No, think Copy&Paste, track change, paragraph fusion/deletion, etc. There
>> are a lot of corner cases.
>
> I don't get it. All of these operations are going through official interfaces
> of the Paragraph class.
> Or is that not true? And the constructors of the Paragraph can invalidate the
> copied caches.
>
> I'd like to check the "paragraph fusion" case. Please, can you point me to
> the code where to find it?

Now I've checked Copy&Paste... it works here without change.
Change tracking I had tested already; it works too.
Paragraph deletion is no problem for the spell checker.
Paragraph fusion is used by Paste (right?), so it should work.

All these operations end up calling Paragraph::insertChar() or
Paragraph::insertInset(), AFAICS.

Stephan
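
P.S. For anyone who wants to reproduce hit-ratio numbers like the ones above on other documents, here is a minimal sketch of the counting. It is not the code in Paragraph.cpp: `CacheStats` and `countWords` are hypothetical names, the "cache" is just a set of words seen so far, and it uses plain `std::string` split on whitespace instead of LyX's own string and word-boundary handling.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical helper, not LyX code: counts how many word lookups
// would be answered by a cache of already-checked words.
struct CacheStats {
    std::size_t words = 0;   // words long enough to be counted
    std::size_t misses = 0;  // first occurrences (would hit the slow checker)

    // Fraction of lookups served by the cache: (words - misses) / words.
    double hitRatio() const {
        assert(misses <= words);
        return words == 0 ? 0.0
                          : static_cast<double>(words - misses) / words;
    }
};

// Splits text on whitespace and registers every word of at least
// minLength bytes. Assumption: punctuation and case are left as-is,
// and length is measured in bytes, which a real checker must refine.
inline void countWords(const std::string& text,
                       std::unordered_set<std::string>& cache,
                       CacheStats& stats,
                       std::size_t minLength = 6) {
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        if (word.size() < minLength)
            continue;
        ++stats.words;
        if (cache.insert(word).second)  // true only for a new word
            ++stats.misses;
    }
}
```

Calling `countWords` once per paragraph with the same `cache` and `stats` accumulates the document-wide totals. Plugging the Tutorial totals in directly (words = 2605, misses = 872) makes `hitRatio()` return the same 0.665 computed by hand.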