Pavel Sanda wrote:
Vincent van Ravesteijn wrote:
Right.. let's start with the biggest document that is around :)...

To find the perfect solution takes a lot of work, and I'm afraid that it is a bit slow to get to the real document data. So.. it will take a while.. but it's almost finished. After that it should be sped up.

The problem is largest when a large piece of text is added. A lot of small changes will be found very quickly.


will you share which matching algorithm you have chosen?

An O(ND) Difference Algorithm and Its Variations (1986), EW Myers
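
For reference, a minimal sketch of the greedy forward search from that paper, which only computes the length D of the shortest edit script (insertions plus deletions). This is not the code used in LyX; the name shortestEditDistance is made up for illustration, and the template parameter is an assumption so that the same routine can compare characters or larger units.

```cpp
#include <string>
#include <vector>

// Length of the shortest edit script (insertions + deletions) that turns
// sequence a into sequence b, following the greedy O(ND) forward search.
// Hypothetical helper for illustration, not taken from LyX.
template <typename Seq>
int shortestEditDistance(Seq const & a, Seq const & b)
{
    int const n = static_cast<int>(a.size());
    int const m = static_cast<int>(b.size());
    int const max = n + m;
    if (max == 0)
        return 0;
    // v[k + max] is the furthest x reached so far on diagonal k = x - y.
    std::vector<int> v(2 * max + 1, 0);
    for (int d = 0; d <= max; ++d) {
        for (int k = -d; k <= d; k += 2) {
            int x;
            if (k == -d || (k != d && v[k - 1 + max] < v[k + 1 + max]))
                x = v[k + 1 + max];      // step down: take an element of b
            else
                x = v[k - 1 + max] + 1;  // step right: drop an element of a
            int y = x - k;
            // Follow the run of equal elements at no extra cost.
            while (x < n && y < m && a[x] == b[y]) {
                ++x;
                ++y;
            }
            v[k + max] = x;
            if (x >= n && y >= m)
                return d;
        }
    }
    return max; // not reached: d == n + m always suffices
}
```

The outer loop runs D times and the inner loop touches at most 2D+1 diagonals, so a handful of small changes finishes after a few rounds, while a large inserted block drives D (and the running time) up.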

speed-ups for this kind of problem must already have been addressed by people 
working on genetic material. just to avoid reinventing the wheel...

It's partly also a LyX problem: "o.text()->getPar(o.pit()).getChar(o.pos())" might not be the fastest way to retrieve the character.

Anyway.. there will probably be ways to speed this up.
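
One generic way to take a slow accessor out of the inner loop is to copy the characters into a contiguous buffer once and let the comparison index that buffer. A rough sketch, assuming nothing about the LyX API: snapshot and slowGet are hypothetical names, and slowGet merely stands in for whatever per-character lookup is available.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Copy every element into one contiguous buffer up front, so that the
// comparison loop indexes a vector instead of calling a slow accessor.
// 'slowGet' is a hypothetical stand-in for the per-character lookup;
// it is not a LyX function.
template <typename T>
std::vector<T> snapshot(std::size_t count,
                        std::function<T(std::size_t)> const & slowGet)
{
    std::vector<T> buffer;
    buffer.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        buffer.push_back(slowGet(i)); // exactly one slow call per element
    return buffer;
}
```

The element type has to be given explicitly when passing a lambda, e.g. snapshot<char>(count, getter). The trade-off is extra memory for the copy in exchange for one slow call per character instead of one per comparison.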

Another possibility is not to check each character, but to start with paragraphs/words etc. If a word on average has 5 characters, the algorithm for words can be up to about 25 times as fast, because in O(ND) both N and D can shrink by roughly a factor of 5.
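
The word-level idea above could start from something like the following sketch, which splits text on whitespace so that a diff compares whole words. The name splitWords is made up, and splitting on whitespace only is an assumption; real documents would also need a rule for punctuation and insets.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split text on whitespace so that a diff can compare whole words
// instead of single characters. Hypothetical helper for illustration.
std::vector<std::string> splitWords(std::string const & text)
{
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) // operator>> skips any run of whitespace
        words.push_back(word);
    return words;
}
```

The resulting vector can be handed to any sequence diff that only needs operator== on its elements. Differences found at word level can then be refined character by character inside the changed words only.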
pavel

Vincent
