You want to go for the difference algorithm based on the edit distance measure. It is better suited for this. See this link, which also explains the relation to LCS:
http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Dynamic/Edit/
Since the edit distance algorithm has O(n^2) complexity, I definitely want the diff algorithm, at least for the top-level paragraph list.
The advantage would be that I don't need to detect mathmode changes etc. from the LyX or LaTeX file. Also I could recurse into the inset hierarchy and maybe detect changes within tables etc.
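
For illustration, a minimal sketch of the edit distance computation discussed above, using the standard dynamic-programming recurrence. It is in Python only to keep it short; the function name edit_distance is made up for this example and nothing here is LyX code. The two nested loops over both inputs are where the quadratic cost comes from.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] is the distance between the previous prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        # curr[0]: turning a[:i] into the empty sequence takes i deletions.
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]
```

Because it only compares elements with ==, the same function works on strings (character by character) and on lists of paragraphs.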
But to be really correct, you want to go for a tree difference algorithm rather than a linear one. Those can be complex, but there is a
I don't want to be mathematically correct :-) My idea is to compare whole paragraphs not for equality but for being "reasonably similar", based on
edit distance. When I merge the two paragraph lists, I call the diff algorithm recursively on those paragraphs which are not equal but only
similar. Maybe I could use the edit distance algorithm for short paragraphs.
After looking at the code of DocIterator & Co. I realised that this will not be as simple as I thought.
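
A rough sketch of the "reasonably similar" idea, in Python and on plain strings rather than on DocIterator, so it says nothing about how hard the real thing is. Everything here is an assumption made for illustration: the names similar, diff and diff_paragraphs, the 0.6 threshold, and the use of difflib's ratio() as a stand-in for a normalised edit distance. The alignment itself is a plain LCS-style table, not an optimised diff.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.6):
    """True if two paragraphs are equal or 'reasonably similar'.

    The threshold is an arbitrary value chosen for this example.
    """
    return a == b or SequenceMatcher(None, a, b).ratio() >= threshold


def diff(old, new, match=lambda a, b: a == b):
    """Align two lists; returns a list of (op, old_item, new_item)."""
    n, m = len(old), len(new)
    # lcs[i][j] = longest matching subsequence of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if match(old[i], new[j]):
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    ops, i, j = [], 0, 0
    while i < n and j < m:
        if match(old[i], new[j]):
            op = "same" if old[i] == new[j] else "changed"
            ops.append((op, old[i], new[j]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("deleted", old[i], None))
            i += 1
        else:
            ops.append(("inserted", None, new[j]))
            j += 1
    ops += [("deleted", p, None) for p in old[i:]]
    ops += [("inserted", None, p) for p in new[j:]]
    return ops


def diff_paragraphs(old_pars, new_pars):
    """Diff paragraph lists; recurse word by word into similar pairs."""
    result = []
    for op, a, b in diff(old_pars, new_pars, match=similar):
        if op == "changed":
            # Not equal but similar: diff the two paragraphs' words.
            result.append((op, diff(a.split(), b.split())))
        else:
            result.append((op, a if a is not None else b))
    return result
```

Paragraphs that are neither equal nor similar simply show up as one deletion plus one insertion, so the recursion only happens where it can produce a small, readable change.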
short-cut, which might be used in 1.5 when we have an XML format: xydiff:
http://www-rocq.inria.fr/gemo/XyDiff/
Interesting.
Thanks for the pointers!
/Andreas